
[Matt Wolfe's AI News] Reading the English Commentary in Japanese [November 25, 2023 | @Matt Wolfe]

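The video opens with the drama at OpenAI over the change in leadership and the departure of researchers, then introduces the development of a new model, Q-Star, and its improved ability to solve math problems. It also touches on the internal disagreement over commercializing Q-Star and the potential risk of it being used to break encrypted information. It goes on to cover progress in large language models, Stability AI's announcement of Stable Video Diffusion in the AI art field and how it fared in user preference studies, advances in AI voice cloning, Luma AI's improved text-to-3D-object technology, a new Google Meet feature, the outcome of an AI copyright lawsuit, and the use of a new deep learning model, PANDA, for detecting pancreatic cancer.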

This week was an absolutely insane week in the world of AI.


I stopped saying that every single week because I know it's lost its meaning.


This week was actually insane, and I'm not even just talking about all of the drama that went down at OpenAI.


So much other stuff happened that kind of flew under the radar because all of that drama was happening.


But it was one of the most insane weeks we've ever seen in terms of new advancements in the world of AI.


So, I'm going to break it all down for you right now.


I obviously can't talk about the past week in AI without mentioning all of the OpenAI Sam Altman drama.


I'm not going to go deep into this in this video because I've literally made four videos about it already.


You can watch one here, one here, one here, and one over here.


But the super TLDR is that on November 17th, OpenAI announces a leadership transition.


They fire Sam Altman, they demote Greg Brockman off the board, Greg Brockman leaves, a bunch of researchers leave, pretty much everybody at OpenAI threatens to leave if they don't bring Sam Altman back.


All of the investors, including Microsoft, are pissed.


They offer Sam Altman and Greg Brockman a job at Microsoft.


With everybody at OpenAI threatening to leave, two CEOs, Mira Murati and Ilya Sutskever, refusing to stay in that position, and all the investors pushing to bring Sam back.

Four days later, on November 21st, Sam was reinstated as the CEO of OpenAI, Greg Brockman's back, the researchers are back, everybody comes back, and pretty much the entire original board is out, except for Adam D'Angelo.


As of right now, there are just three board members: Bret Taylor, Larry Summers, and Adam D'Angelo, with it being expected that in the coming weeks or months, more members will be added to the board, most likely including some representation from Microsoft, OpenAI's biggest investor.

Again, that's a super simplified version, but I've made a bunch of videos about it, so watch those if you really want to deep dive into all of the drama that happened, because that in itself was pretty crazy.


For the most part, that's where the story ended.


Now, there's been a little bit of news that came out after that, first reported on by The Information.


OpenAI made an AI breakthrough before Altman's firing, stoking excitement and concern.


The rumor is that the researchers built a new model called Q-Star that was able to solve math problems that it hadn't seen before.


And the speculation is that the reason that the board wanted to get rid of Sam was because they didn't want the commercialization of this Q-Star, and they didn't want it to get into the hands of regular consumers because it's potentially too powerful.


The following day, Reuters, a very respectable news site, claimed the same information.


Ahead of OpenAI CEO Sam Altman's four days in exile, several staff researchers wrote a letter to the board of directors warning of a powerful artificial intelligence discovery they said could threaten humanity.


Two people familiar with the matter told Reuters.


Reuters was unable to review a copy of the letter.


The staff who wrote the letter did not respond to requests for comment.


Reuters contacted OpenAI, and they declined to comment.


And there really hasn't been a lot of verification yet, which is why I really haven't done a full breakdown of what is Q-Star.


Though only performing math on the level of a grade school student, acing such tests made researchers very optimistic about Q-Star's future success.


The source said.


Now, you might be wondering if this is only able to perform math on the level of grade school students, why is it that big of a deal?


Well, here's the thing.


So far, large language models aren't great at math.


They're great at predicting the next word in a piece of text, but not so much math.


However, with something like Q-Star that can perform math and also learn how to get better and better at math, this leads to potential massive problems for humanity.


The reason is that encryption is such a huge part of everything we do online.


Our personal data, our usernames, our passwords, our financial data, all of this uses encryption.


The stronger the encryption, the harder it is for computers and hackers to break that encryption and get the information contained within the encrypted data.
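To put a rough number on "stronger encryption is harder to break," here is a minimal sketch of brute-force key search. It assumes an attacker who simply tries every key; the guesses-per-second figure is a made-up illustration, and nothing here models Q-Star or any real attack.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def brute_force_years(key_bits: int, guesses_per_second: float) -> float:
    """Worst-case years needed to try every key of the given length."""
    total_keys = 2 ** key_bits  # every extra bit doubles the keyspace
    return total_keys / guesses_per_second / SECONDS_PER_YEAR

# Hypothetical attacker speed: one billion guesses per second.
for bits in (40, 56, 128):
    years = brute_force_years(bits, 1e9)
    print(f"{bits}-bit key: about {years:.3g} years to try every key")
```

The point of the sketch is the doubling: each additional key bit doubles the work, which is why longer keys are considered stronger against this kind of exhaustive search.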


However, if these algorithms get better and better at math, they will likely also be better and better at decrypting all of this encryption.


The entire web is built on using encrypted information to keep hackers out and to keep computer programs from accessing all of this data. Something like Q-Star, as it advances, could get better and better at basically hacking this data and decrypting this information, which could potentially cause widespread chaos.


And that's just one of the implications of this potential Q-Star that they're talking about here and why some people probably don't want it in the hands of just anybody.


Now, this is a gross oversimplification of the potential impact of all of this, but if this technology isn't safely locked down and it gets better and better at math, the implications are huge and very scary.


Again, I haven't really reported on this in any of my videos because so far it's all been sort of hearsay.


A friend of a friend told their mom that they're doing this, right?


Nobody from OpenAI has commented.


All of the sources are anonymous.


There's really no real verification other than the fact that I do think that both Reuters and The Information are fairly credible and thorough with their research.


As more information comes out and we do get confirmation, I will definitely be making more content about this.


I just want something a little more concrete than what we've been given so far.


There's also this statement from The Verge that came out on November 26th that says a recent OpenAI breakthrough on the path to AGI has caused a stir.


Of course, it's talking about Q-Star here and how this could be a step toward creating artificial general intelligence.


However, if we read the rest of this little statement here, it says, After the publishing of the Reuters report, which said senior exec Mira Murati told employees that the letter about Q-Star precipitated the board's actions to fire Sam Altman last week, OpenAI spokesperson Lindsey Held Bolton refuted the notion in a statement shared with The Verge.

Mira told employees what the media reports were about, but she did not comment on the accuracy of the information.


Separately, a person familiar with the matter told The Verge that the board never received a letter about such a breakthrough, and that the company's research progress didn't play a role in Altman's sudden firing.

So, at the moment, we've got anonymous sources, no confirmation from OpenAI.


In fact, we even have a denial from OpenAI.


And to be quite honest, we don't have a lot to work off of.


But trust me, it's something I'm paying very close attention to and will likely talk about more as more information comes out.


Amidst all of the chaos that was happening at OpenAI, at a time when Greg Brockman didn't even know if he was going to have a job there anymore, he was still announcing new features, including the fact that ChatGPT with voice was now available to all the free users of ChatGPT.


Now, if you're a plus user, you've had it for a while.


You can open up your phone, talk to ChatGPT, it responds to you.


It's sort of a better version of what you'd get out of Siri.


Now, that's available in the free version as well.


And it blows my mind that even among all of the turmoil, somehow OpenAI continued to ship a new feature.


And Greg Brockman, the ex-president, the ex-chairman of the board, was making the announcements about it, even though at the time he wasn't even technically part of the company anymore.


Again, I've made a ton of content about the OpenAI drama.


Definitely check out those videos.


But for the rest of this video, let's break down all of the news that flew under the radar while all the OpenAI drama was happening.


On Wednesday, November 22nd, while all of this was happening, Inflection AI released a brand new model that they claim is the second best model behind GPT-4.


The startup behind the chatbot Pi says its Inflection 2 model outpaces popular alternatives from Google and Meta and is catching up to OpenAI's flagship fast.


Inflection 2 performed better than Google's PaLM model and also beat the open-source LLaMA 2 model.

Overall, Inflection's model is the top performing of its size.


It only trails GPT-4, which is thought to be significantly larger.


Now, we don't have access to this yet.


The newly released model will soon be integrated into Pi, the chatbot that Inflection released in May.


But first, it needs a bit more extra work known as alignment to teach it Pi's tone and answering style and to help Pi function better while absorbing up-to-date information without additional hallucinations.


Suleyman said.


We can see here with the light green bars how it performed on a handful of benchmark tests, comparing it to PaLM 2 and Inflection 1.


And you can see in all of these various benchmarks, essentially tests that it gave the chatbot, it outperformed all of these other models.


Interestingly, they didn't put GPT-4 on here to see how it benchmarked against those.


Again, it's something we don't have access to yet as consumers.


Once again, as we do get access and they do roll it out into their Pi bot, we will definitely be exploring it some more and playing around with it and seeing what it can do.


But Inflection wasn't the only one who released a new model this week.


Amid all of the chaos that was going on at OpenAI, Anthropic introduced Claude 2.1.


If you've used Claude before, you know that Claude is really good at taking long documents and summarizing them into shorter documents.


Well, previously, it had a 100,000-token context window, meaning that it was able to ingest and respond with roughly 75,000 words.


Well, Claude 2.1 here upped that to a 200,000-token context window.


So this doubles the amount of input and output combined we can get from Claude.


Now we can use about 150,000 words, which is over 500 pages of material.
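The token-to-word arithmetic above can be sketched in a few lines. The 3-words-per-4-tokens ratio is the rule of thumb implied by the figures in the video (100,000 tokens is roughly 75,000 words); the 300-words-per-page figure is an assumption added here for illustration, so the page counts are ballpark only.

```python
# Rule of thumb implied by the video: 100,000 tokens is roughly 75,000 words.
WORDS_PER_TOKEN_NUM, WORDS_PER_TOKEN_DEN = 3, 4
WORDS_PER_PAGE = 300  # assumption for illustration, not a figure from the video

def tokens_to_words(tokens: int) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    return tokens * WORDS_PER_TOKEN_NUM // WORDS_PER_TOKEN_DEN

def words_to_pages(words: int) -> int:
    """Estimate the page count for a given number of words."""
    return words // WORDS_PER_PAGE

for window in (100_000, 200_000):
    words = tokens_to_words(window)
    print(f"{window:,} tokens ~ {words:,} words ~ {words_to_pages(words):,} pages")
```

Real tokenizers vary by language and content, so treat the ratio as a quick estimate rather than a guarantee of what fits in the window.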


This new model also has a two times decrease in hallucination rates, more functionality to the API, and a better developer experience.


But really, the main thing we care about is this 200,000 context window and the fact that it's going to hallucinate a heck of a lot less.


We're going to be able to put in a lot of information, like full textbooks, and ask questions about it and get responses.


Now, with Anthropic, Greg Kamradt here on X actually did a breakdown testing Claude 2.1 and basically found that where the information is located within the text matters.


Facts at the very top and very bottom of the document were recalled with nearly 100% accuracy.


Facts positioned at the top of the document were recalled with less performance than the bottom.


Starting at about 90,000 tokens, performance of recall at the bottom of the document started to get increasingly worse.


Even though it is a large context window, your facts are not guaranteed to be retrieved, and that position within the text matters.


He found that facts placed at the very beginning and in the second half of the document seem to be recalled better.
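A position test like the one described can be sketched roughly as follows: hide one fact (the "needle") at different depths in a long filler document and check whether the model can still answer a question about it. Everything here is hypothetical: `insert_needle`, `recall_by_depth`, and especially `toy_model`, which is a stand-in that only looks at the first and last 30% of the text so the sketch runs without any API. It does not reproduce the actual test harness or Claude's real behavior.

```python
from typing import Callable, Dict, List

def insert_needle(filler: List[str], needle: str, depth: float) -> str:
    """Place `needle` at a relative depth (0.0 = top, 1.0 = bottom)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    position = round(depth * len(filler))
    return " ".join(filler[:position] + [needle] + filler[position:])

def recall_by_depth(
    filler: List[str],
    needle: str,
    question: str,
    expected: str,
    ask_model: Callable[[str, str], str],
    depths: List[float],
) -> Dict[float, bool]:
    """For each depth, record whether the model's answer contains the fact."""
    results = {}
    for depth in depths:
        document = insert_needle(filler, needle, depth)
        answer = ask_model(document, question)
        results[depth] = expected.lower() in answer.lower()
    return results

def toy_model(document: str, question: str) -> str:
    """Stand-in for a real LLM call: only 'sees' the first and last 30%."""
    cut = len(document) * 3 // 10
    visible = document[:cut] + document[-cut:]
    return "The code is 7421." if "7421" in visible else "I don't know."

filler = [f"Filler sentence number {i}." for i in range(100)]
results = recall_by_depth(
    filler,
    needle="The secret code is 7421.",
    question="What is the secret code?",
    expected="7421",
    ask_model=toy_model,
    depths=[0.0, 0.5, 1.0],
)
print(results)  # {0.0: True, 0.5: False, 1.0: True}
```

To run this against a real model, `toy_model` would be swapped for a function that sends the document and question to that model's API and returns its text response.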


He gave a lot more details into the findings from his Claude 2.1 tests, and like always, I'll make sure it's linked up in the description below so that you can review it if you'd like.


In some other exciting news, Elon Musk said that the xAI chatbot Grok is going to launch to Premium Plus subscribers next week.


That would make it the week of November 27th.


Sometime during that week, we should be getting access.


You can see that Elon actually put it in a reply to somebody else's tweet.


Tesla Owners Silicon Valley wrote, X and Grok combined is going to be mind-exploding.


And then Elon Musk wrote, Yeah, Grok should be available to all X Premium Plus subscribers next week.


We also got a little sneak peek this week of what Grok could look like inside of the app.


You can see there's a new little icon here and then a text box that says Ask Grok.

So we'll just have to wait till next week and hopefully we can get our hands on it.


I'm really excited to make a breakdown video where I compare GPT-4, Claude 2.1, Inflection 2, Grok, and even Bard and figure out what each one is specifically best at.

That's a video that I'm really excited to make, but I'm just waiting to get access to all of these various different tools so that we can do that breakdown.


Speaking of Bard, Google made some updates to Bard this week.


I told you this week has been crazy.


There's been updates for every single large language model.


Pretty much in this latest update to Bard, Bard can now watch YouTube videos for you.


Bard's YouTube extension can now handle complex queries about specific video content, like recipe quantities and instruction summaries.


It says the bot's YouTube integration is getting a handy upgrade so it can analyze individual videos to surface specific information for you, things like key points or recipe ingredients, without ever even pressing play.


This article goes on to say it's not perfect at the moment.


The feature only exists as an opt-in Labs experience, and it does take a little bit of work to get the answer you're looking for.


For example, this person asked for a full recipe from a video, but Bard wasn't able to generate anything.


But then asking it for step-by-step instructions in a subsequent prompt got the whole recipe.


Now, if I log into my Bard account here, I click on my little extensions button, you can see there is a YouTube extension that's already flipped on here.


There's an excellent video out, and honestly, everybody who's interested in learning about large language models needs to watch it.


But since this video was released this week, I want to plug this in and see how well it does with an hour-long video.


So this is a video from Andrej Karpathy here.


It's called [1hr Talk] Intro to Large Language Models.

Excellent breakdown.


And we'll jump into Bard, create a new chat.


I just pasted the URL to the YouTube video in here, and it says, I have watched the video, and here's a summary: The video is an introduction to large language models.


LLMs are a type of artificial intelligence, blah blah blah.


In the video, the speaker discusses the capabilities of LLMs.


Overall, the video provides a good overview of LLMs and their potential application.


So if I do something like, Give me the bullet points of all the main topics covered in this video, let's see what we get.


I actually got this message that says, Query unsuccessful, but then some bullets anyway.


What are large language models?


How are LLMs trained?


What are the capabilities?


How are they evolving?


What are the challenges of working with LLMs?


What are potential applications of LLMs?


So it seems to have found bullets anyway, but I don't know if it pulled that from the video or if it actually pulled it from like the timestamps here that are all available in the description.


So like the original article said, still leaves a little bit to be desired, but we'll explore this more in future videos because this is a feature I really want to see work.


Although, it does scare me a little bit because what's to stop people from just getting summaries of my videos instead of watching the whole thing?


Hopefully, you're watching it because you enjoy me talking about it, and you don't want just the bulleted list.


But I understand if you want to save time.


Continuing on in the world of large language models, once again amid all of the chaos this week, Microsoft released its Orca 2, which is a pair of small language models that actually outperform their larger counterparts.


This model comes in two sizes, 7 billion and 13 billion parameters, and it either matches or outperforms models like Meta's LLaMA 2 Chat 70B.


So that's a 70 billion parameter model that it was trained on, and it's actually matching or outperforming with only 7 billion or 13 billion parameters.


Fewer parameters means a lot faster to train, a lot faster to get responses, a lot less expensive to use, and likely you can run a model like this locally on your own computer.
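A back-of-the-envelope way to see why parameter count matters for running a model locally is to estimate the size of the weights alone. This sketch assumes 16-bit weights (2 bytes per parameter), which is an assumption for illustration and not something stated in the video; it also ignores activations and other runtime memory.

```python
def weights_size_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9

models = {
    "Orca 2 7B": 7_000_000_000,
    "Orca 2 13B": 13_000_000_000,
    "LLaMA 2 Chat 70B": 70_000_000_000,
}

for name, params in models.items():
    print(f"{name}: about {weights_size_gb(params):.0f} GB of weights at 16-bit")
```

Under that assumption the 7B model needs roughly a tenth of the memory of the 70B model, which is the gap that makes local use plausible.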


Microsoft has open-sourced both new models for further research on the development and evaluation of the smaller model.


Looking at their benchmark results here, we're looking at the light blue and the dark blue, the two lines on the left, and you can see that 7B and 13B pretty much outperformed all these other models like LLaMA 2 Chat 13B, LLaMA 2 Chat 70B, WizardLM 13B, and WizardLM 70B.


So with these smaller parameter models, it is outperforming the much larger parameter models, which are a lot more expensive and time-consuming to train.


But we got more wild announcements this week in the large language model world.


We have the Q-Star stuff, we've got inflection, we've got Anthropic, we've got Bard, and we've got Microsoft's Orca 2, all making waves in the same week that this OpenAI drama was happening.


But we also got some really cool stuff in the AI art world, including Stability AI announcing Stable Video Diffusion.

Now, Stable Video Diffusion appears to be very similar to what we get out of Runway Gen 2 or Pika Labs, where we can get like, you know, three- to four-second short generations, and they're roughly the same quality, maybe a slight improvement.

They are using Stability AI's SDXL to generate the initial images that then become animated.


But it doesn't feel like a huge leap yet above what we're getting out of Pika and Runway Gen 2.


However, they did do some tests against them.


I don't totally understand what these charts are showing us.


It says here that through external evaluation, we have found these models surpass the leading closed models in user preference studies.


So they must have been doing like polls to the audience of like, Here's a video and here's a video, which one do you like better? and sort of getting feedback on which ones people liked better.
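A user preference study of this kind usually boils down to a win rate over pairwise votes. Here is a minimal sketch of that calculation. The votes below are invented for illustration and are not Stability AI's data, and the actual methodology of their study is not described in the video.

```python
from collections import Counter
from typing import List

def win_rate(votes: List[str], model: str) -> float:
    """Fraction of head-to-head votes that picked `model`."""
    counts = Counter(votes)
    total = sum(counts.values())
    return counts[model] / total if total else 0.0

# Hypothetical votes: each entry is the video a viewer preferred.
votes = ["SVD", "Runway", "SVD", "SVD", "Runway"]
print(f"SVD win rate: {win_rate(votes, 'SVD'):.0%}")
```

A win rate near 50% would read as a tie between the two models, while anything clearly above it would read as one model outperforming the other.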


And with the 14-frame Stable Video Diffusion model (SVD), it seemed to tie Runway Gen 2 and outperform Pika.


And with the 25-frame model, Stability outperformed both Runway and Pika.


But you can actually use this model for free right now.


I got two really good resources for you.


There's a Hugging Face space available right now, which uses Stable Video Diffusion to do image to video.

So you can drag and drop an image here and get a video outputted here.


But this one does not appear to use any sort of text input for it.


I drag this little wolf image here that I created in the past over into Hugging Face and click generate.


I get a message that there's a long queue of requests pending.


I am 10 of 10, and it looks like it's going to take about 300 seconds.


So while I'm waiting for that, let me show you the other option that is currently available, which theoretically should be quite a bit faster.


If you head over to decoherence.co, they actually have a Stable Video free preview over here.

They've got this little race car image here, experiment with the research preview of stable video free for everyone.


This one has both an image to video and a text to video option, and we have a little slider to add less motion or more motion.


For the prompt, I'll type: a rocket ship blasting off from a launchpad.


I put the motion to about 3/4, and let's generate the video.


This looks like it's probably going a little bit faster over here.


You can see now I'm Q7 of n.


That's still taking a while.


So the decoherence version took about 1 minute here, and we got a 3-second video.


Let's take a peek.


We've got our rocket ship here, and it looks like all the smoke is animating, but nothing about the rocket ship itself is actually animating.


We're just getting this smoke, but it was pretty dang quick to do it.


Let's check in on Hugging Face.


I am now Q5 of 8.


Let's try the image to video over here.


I'm going to plug in this image here of a sort of river or lake with a tree and some clouds.


Let's go ahead and just generate this and see what it does for us.


Once again, took less than a minute.


Take a peek here.


Not bad.


You can see the water ripples.


There's some camera motion.


The clouds are moving a little bit.


Came out pretty solid.


Checking back to Hugging Face, I am now Q2 out of 10.


We've been going for 310 seconds, and now my estimated time is up to 360 seconds.


So still waiting on this one.


This one took a full 8 minutes to generate after waiting in queue, and then finally processing, and we got a 3-second video that looks like this, and it just kind of spins and warps and twists it into nothing.


That's the new stable video diffusion.


I personally recommend checking it out over at decoherence.co.


You'll bypass a lot of the queue you'd hit if you did it on Hugging Face.

Staying in the vein of AI video, this week Runway Research finally released their Gen 2 motion brush.


We teased it in last week's video and got a peek of what it'll look like.


This week, we actually got access to it.


I found this Twitter thread here from Min Choi of a bunch of different people showing off their generations that they made with this new motion brush with Gen 2.

I will link this up in the description below so you can see some other examples of what other people have done with it.


But personally, I want to get in and play with it myself.


So jumping into Runway here, I get this popup that says motion brush is now available.


Motion brush is now available in image to video.


Just paint an area or subject in your image, choose a direction, and your movement intensity value.


Let's go ahead and try it.


So I just uploaded an image here that I generated with Midjourney.


Let's go ahead and click on motion brush, and you can see I've got like another little river here with some clouds and a tree.


Go ahead and make our motion brush a little bit bigger, and let's see if we can get it to just kind of affect the clouds up here.


Let's see, I want some horizontal movement.


Let's make it moving to the left, and I'll leave the vertical and proximity movement the way they are.


And let's click save and let's just go ahead and generate, see what it gives us.


Well, it took less than a minute to generate it.


Let's see what it did.


That's actually really, really good.


One thing I'm really noticing is how much the light is impacted.


You can see the sun sort of streaking through the clouds here.


And then, you can actually see the reflection on the water changing as the clouds move.


And if you look at the tree, the tree has no motion to it.


The ground has no motion to it, it just affected the clouds and the reflection in the lighting.


Like this is actually a lot better than I was expecting.


Like I'm pretty blown away by this.


I actually want to try one more.


So, I uploaded this image of a girl with like butterflies around her and I want to see if I can get some motion on just the butterflies.


So, I'm just going to go ahead and mask only the butterflies.


And let's do proximity movement, so maybe they're getting closer, and maybe a little bit of vertical movement.


And I'm going to leave the horizontal movement alone on this one.


And let's see what we get out of this.


But they're not flapping their wings like butterflies, they're more kind of falling.


But if you look at it, it did a great job.


The girl in the middle is staying consistent, the ring around her is staying consistent.


None of these flowers or anything are moving.


The only things that are moving are the butterflies that I highlighted, which I think maybe it thought was flowers or something.


I don't know, but it looks pretty good.


It's only animating the stuff that I highlighted and not the stuff I didn't want it to animate.


We also got some advancements this week in the world of AI voice cloning.


ElevenLabs, probably the best text to speech generator on the market right now, announced that this week they now have speech to speech.


So, you can train ElevenLabs on your own voice or custom voices.


You can use one of ElevenLabs' existing voices and just like before, you can type in text and get it to say out that text in that voice.


But also, let's say you wanted to give certain inflections on your voice or speak in a certain tone or emphasize certain words.


Now, you can actually speak and it will take what you spoke and change it to the voice that you chose.


When I log into ElevenLabs here, you can see that there are new tabs up at the top.


We've got text to speech and speech to speech.


At one time, I had some fun kind of training an Owen Wilson style voice into this.


And I can do a text to speech where it says, Subscribe to Matt Wolfe on YouTube.


And it sounds like this: Subscribe to Matt Wolfe on YouTube.

I mean, I could kind of tell that's Owen Wilson's voice, but I wish he put more emphasis on certain words.


So, let's see what happens if I switch to speech to speech, leave it on this Owen Wilson voice, and let's record some audio.


Subscribe to Matt Wolfe on YouTube.


So, I kind of put the emphasis on different words.


And let's go ahead and click generate.


And this is what it sounds like: Subscribe to Matt Wolfe on YouTube.

So, you can see it follows the inflections and the emphasis that I put when speaking it out.


This just gives us that extra level of control when we want our dialogue to sound exactly like we want it to, but maybe not necessarily in our own voice.


Last week, Luma AI announced their new Genie research preview, which is essentially text to 3D objects.


This week, they've announced that they've already improved upon it.


You now have the ability to add negative prompts and control the seed number.


They've also improved the stability and quality.


There's been content style updates to bot messages, and soon you'll be able to use it in other Discord servers.


As a quick reminder, if you do want to use Genie, you sign up for the Luma AI Discord, you jump into their Discord, pick one of these Genie rooms here, you type /Genie here.


And then you enter a prompt like a monkey holding a taco.


And then, if I click on this +2 more here, you can see we now have options for seed and negative prompts.


I'll go ahead and add a negative prompt, say ugly, disfigured, and distorted.


I don't know how much that's going to improve anything, but I just wanted to remind you how to use this tool.


We hit enter and literally just a few seconds later, like this was insanely fast, I have some monkeys with tacos.


I mean, a couple have tacos sort of built into their heads.


This is pretty dang impressive right here, especially seeing what 3D generations and text to 3D looked like just 3 months ago.


This is a huge, huge advancement.


You can also see that we have a seed number here, so if we want to generate one that looks the same and come back and use the same seed, we can.
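The idea behind a seed number can be shown with an ordinary pseudo-random generator: the same prompt and seed produce the same "random" starting values, so the generation can be repeated. This is only a generic illustration using Python's standard `random` module. The `starting_noise` function is hypothetical and says nothing about how Genie actually uses its seed internally.

```python
import random
from typing import List

def starting_noise(prompt: str, seed: int, size: int = 4) -> List[float]:
    """Hypothetical stand-in for the random values a generator starts from."""
    rng = random.Random(f"{prompt}|{seed}")  # same inputs, same sequence
    return [rng.random() for _ in range(size)]

first = starting_noise("a monkey holding a taco", seed=1234)
again = starting_noise("a monkey holding a taco", seed=1234)
other = starting_noise("a monkey holding a taco", seed=5678)

print(first == again)  # True: reusing the seed reproduces the values
print(first == other)  # a different seed gives different values
```

That is why writing down the seed is useful: coming back later with the same prompt and seed should give a result that looks the same, while changing the seed gives a fresh variation.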


A new feature is getting rolled out into Google Meet that uses AI to detect when you actually raise your hand on camera.


To raise your hand in the meeting, you can see their little animation here where this lady raises her hand and it says raising your hand.


So, it's using some sort of computer vision model to be able to know when you raise your hand.


It says here it's available across most Google Meet Workspace plans, but you do have to turn it on.


It is off by default, but can be enabled in the settings.


I guess there was some sort of trend going around where people were making Disney movie posters using Microsoft's AI image Creator, like this kimchi poster here that seemed to have worried Microsoft.


And now Microsoft has disallowed the word Disney in its prompts.


Also, this week Sarah Silverman hit some stumbling blocks with her AI copyright case.


We talked about this a couple months ago.


She was suing, along with some other authors, basically saying that these large language models ingested her entire book and now anybody can just go to a ChatGPT and ask questions and get responses from the book and don't need to buy the book.


Therefore, it infringes on copyright.


The judge went, Nah, get out of here with that.


The generations that this is creating are different enough that it doesn't really compete with your book.


It wasn't recreating anything from the book.


It's just essentially able to answer questions about the book.


In fact, the judge went as far as to say, This is nonsensical.


There's no way to understand the LLaMA models themselves as a recasting or adaptation of any of the plaintiffs' books.


To prevail on a theory that LLaMA's outputs constitute derivative infringement, the plaintiffs would indeed need to allege and ultimately prove that the outputs incorporate in some form a portion of the plaintiffs' books. An alleged infringer's derivative work must still bear some similarity to the original work or contain the protected elements of the original work.


So basically, if somebody enters a prompt, nothing in the response bears enough similarity to, or shows any of, the protected content.


If you watch the past videos where I break down this news when it first came out, this was pretty much the outcome I expected and it's playing out pretty much how I think anybody would have thought it would have played out.


There's not a whole lot of grounds for this lawsuit at the moment.


And finally, in an article from nature.com this week, we learned about a large-scale pancreatic cancer detection via non-contrast CT and deep learning.


Basically, a new deep learning model called PANDA, which stands for pancreatic cancer detection with AI, can very accurately detect pancreatic lesions, and it can do it from non-contrast CT scans.


Now, I don't totally know what that means, but according to the article, this was something that was very difficult to do even for radiologists.


This technology seems like it's going to really help radiologists detect pancreatic cancer a lot sooner than they've previously been able to.


And like I've mentioned in so many of my AI videos, I think the biggest shakeups in the world as a result of AI are going to come in the healthcare space, and things like this are constant proof of that.


When I think about AI and the fears, I often think about how a lot of the benefits of AI should drastically outweigh a lot of the things that people are scared of, and the advancements into healthcare is one of those areas that I feel very, very strongly about.


I really think the advancements in this area are going to completely change the world, and that's what I'm so excited about.


So like I said at the beginning of this video, absolutely insane week in the world of AI, even when you take all the OpenAI drama out of the picture.


I don't know exactly at this time what my title said or what my thumbnail looked like, but it wasn't clickbait.


This was a crazy week, and the world will never be the same after this week.


All of the large language models seem to have gotten improvement this week.


There's the possibility that this Q-Star is in the works and can do math with these large language models.


We've got new advancements in AI video.


We've got judges slapping down these lawsuits left and right, and it's just an exciting time in the world of technology.


I'm really pumped about this, and if you get pumped about this stuff as well, check out Future Tools.

This is where I curate all the coolest AI tools that I come across.


It's updated pretty much every day.


I also keep the AI news page up to date on a daily basis here, and I've got a free newsletter.


Just go to Future Tools, click this button, and I'll keep you in the loop directly in your inbox.

And if you like this video and you want to stay in the loop with the latest AI news, tutorials, and research, give this video a thumbs up and subscribe to this channel, and I'll make sure more videos like this show up in your YouTube feed.


Thank you once again for tuning into this video and nerding out over the latest AI news with me.


I really, really appreciate you.


Hope you keep coming back for more, and if you do, I'll see you in the next video.



