
[Matt Wolfe's AI News] Reading the English Commentary in Japanese [January 27, 2024 | @Matt Wolfe]

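This video covers recent advances in AI, with a particular focus on AI video generation. MagicVideo-V2, from TikTok's parent company ByteDance, is an improved text-to-video model. Google's Lumiere can generate video from text or images, while ActAnywhere, developed jointly by Stanford and Adobe, combines a subject with a background to create video. Google Chrome's latest AI update adds features such as new tab management and theme creation. Google's Gemini makes it easier to customize search ads, and the Art Selfie feature lets you insert your own face into historical images. Midjourney's V6 model makes it easier to zoom and retouch images, and ElevenLabs raised $80 million and launched Dubbing Studio, which translates audio and video files. A fake AI-generated Joe Biden robocall has also become a problem. OpenAI lowered the price of GPT-3.5 Turbo, updated GPT-4 Turbo, and is planning to set up chip factories. Google and Hugging Face have started an open AI collaboration, Apple Podcasts introduced automatic transcripts, and Disney is developing a platform for the metaverse.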

There's been a ton of really cool AI announcements and research that's been coming out lately, and to be quite honest, most of it's been flying under the radar.


Especially some of this really cool stuff that we're getting in the world of AI video generation.


For example, ByteDance, the creator of TikTok, recently released this research paper called MagicVideo-V2.


It's a text-to-video model that is quite improved over text-to-video models we've had previously.


This website has tons and tons of examples of text-to-video prompts, like a walking figure made out of water, Hulk wearing virtual reality goggles, a fat rabbit wearing a purple robe walking through a fantasy landscape, and much, much more.


But if we come down here, we can actually compare it to the existing video models that are available.


So, this left column that we see here is all MagicVideo-V2.


The middle videos are Stable Video Diffusion.


And then what we're seeing on the right is, on the top, we've got Pika, and on the bottom, we've got Runway Gen-2.


MagicVideo-V2 seems to do a pretty decent job and quite a bit better than the alternatives.


Now, Stable Video Diffusion and MagicVideo-V2 look pretty similar.


I like the colors better in MagicVideo-V2.


But then, comparing it to Pika and Gen-2, Pika almost looks like a guy wearing a cape and there's no details at all.


Gen-2, the guy appears to be just sort of walking in place or even walking backwards.


And there are just a ton of examples here showing how MagicVideo-V2 compares against the existing platforms.


And if you'd like to take a peek at some of these, as well as get more details about how the actual research works, I will link it up in the description below.


Unfortunately, this isn't a piece of tech that we have access to yet.


We can't actually play with this one yet.


And really, the same goes for this next one called Lumiere, which is just on another level when it comes to AI video.


This one comes out of Google Research, and it can do some pretty realistic text-to-video.


For example, this prompt right here: a dog driving a car on a suburban street wearing funny sunglasses.


The video is kind of realistic.


Chocolate syrup pouring on vanilla ice cream.


Sailboat sailing on a sunny day in a mountain lake.


A red Lamborghini Aventador coming around a bend in a mountain road, or one of my favorites: silhouette of a wolf against a twilight sky.


These are truly impressive.


But this Lumiere model doesn't only do text-to-video, it also does image-to-video.


So, for example, if I hover over this one, we could see the original image combined with the prompt: Bigfoot walking through the woods.


When I unhover, you can see the animation that it created from it.


Up here, we have a turtle swimming.


We can see just a shot of a turtle.


If I move away from it, you can see the turtle sort of looks into the camera, and it's really realistic looking.


It's also capable of matching styles of input images.


So, for example, they gave it this tree sticker-looking image.


And then, gave it the prompt of a colorful parrot showing off its vibrant feathers.


And you can see that it sort of animated a coloring that matches the same style as the input image.


Again, there are just tons of cool examples here where you can see it's matching the style from the image on the left, but then adhering to the prompt that was given.


My favorite is actually this one right here, where you've got this sort of glowing mushroom.


They give a prompt of a lion with a majestic mane roaring, and you can see it animates this lion roaring, but it follows the same sort of glow-in-the-dark look.


It's also capable of cinemagraphs, which is basically where you just take a section of the image and say, "I want to animate just that piece."


So, for example, they selected just the butterfly.


The rest of the image is still, but the butterfly is animated.


A picture of a bonfire, they selected just the bonfire.


The rest of the image is frozen, but the bonfire is animated.
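
To make the cinemagraph idea a bit more concrete, here is a minimal sketch of mask-based compositing: the still image is kept everywhere except inside a selected region, which is copied from each animated frame. This is only an illustration of the "frozen outside the mask, moving inside it" behavior, not how Lumiere itself works (Lumiere generates the motion with a model). The `composite_cinemagraph` function, the tiny 2x2 grayscale images, and the mask are all hypothetical.

```python
from typing import List

Image = List[List[int]]   # grayscale image: rows of pixel values
Mask = List[List[bool]]   # True where motion is allowed


def composite_cinemagraph(still: Image, animated_frames: List[Image], mask: Mask) -> List[Image]:
    """Keep the still image everywhere except the masked region,
    which is copied from each animated frame."""
    output = []
    for frame in animated_frames:
        blended = [
            [frame[y][x] if mask[y][x] else still[y][x] for x in range(len(row))]
            for y, row in enumerate(still)
        ]
        output.append(blended)
    return output


# Hypothetical toy data: only the top-left pixel is allowed to move.
still = [[1, 1], [1, 1]]
animated = [[[5, 6], [7, 8]], [[9, 9], [9, 9]]]
mask = [[True, False], [False, False]]
result = composite_cinemagraph(still, animated, mask)
```

In this toy run, only the masked pixel changes from frame to frame, while the other three pixels stay identical to the still image.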


And there are some other examples here.


It can also do video inpainting.


So, they took this source video here, masked out the side of it, and then it inpainted more balloons behind it.


Now, the newest version of Pika, Pika 1.0, is capable of doing this kind of stuff, but again, this is just another level that we haven't seen yet.


Here's some more examples of inpainting, where there's the source video of this woman spinning around, and they were able to select just her body up here and give different prompts: wearing a gold strapless gown, wearing a striped strapless dress, wearing a purple strapless dress, wearing a black strapless gown.


You can see the videos pretty much stay identical, other than the clothes that she's wearing at the moment.


This is still just a giant tease.


Google showed it off, basically told us, "Look what's coming with AI video," but then there's really no platform to access it through.


And knowing Google, I doubt it's going to be open source.


We'll just have to see what Google tool this tech ends up in and how we'll get to play with it.


But it's looking really, really impressive.


The distance we've come in the world of AI video between this time last year and now is just mind-blowing, in my opinion.


And while we're on the topic of AI video, here's some more research that we don't necessarily have access to.


Don't worry, I'm going to show off some tools we have access to, but this is some of the coolest research that's been coming out.


And these are a little bit of a sneak peek into the future of what we're going to get with some of our AI tools real soon.


So, this one's called ActAnywhere, and it's from Stanford and Adobe.


And what's really cool about this is they isolated a subject moving, then they gave it a background.


And then, when they combined them, the background actually moves along with the animation.


So, it knows that this is a woman running, and then the background is moving at a similar pace to what she's running at.


We can see some more examples down here.


Here's the original video of a duck just kind of swimming around in a pond.


You can see this is not used as a model input.


They just segmented out the duck to get this little piece of the duck without the water.


They added an image of a bonfire with a duck in front of the bonfire, and then it followed the same animation pattern from up here and combined it with this video of the fire.


Here's another one of, like, a surgeon or someone getting ready.


You can see here's the segmented out video, and then here's the final output from this background where the background sort of moves with the animation.


It's sort of aware of the animation and then makes sure that it all lines up and is congruent.


And again, lots of really cool examples.


Somebody kite surfing here, somebody running along the beach, and now over here they're running along a lake.


And instead, somebody on a jet ski got transferred to a horse.


Just some really, really cool tech that we're seeing come out of this.


Again, I'll make sure this is linked up below so you can see all of the examples here as well as learn about the actual research if you want to learn more.


And since we're on the topic of AI video, some of you may have seen this already, some of you may have not, but Kanye West just released a new song and pretty much the entire video looks like it was all generated with AI.


I'm not going to get into too much commentary around this and how I feel about Kanye and all that sort of thing, but it is cool to see more of this AI generation kind of becoming mainstream.


It doesn't even appear he used some of the best, most recent AI video generation tech, but it still looks pretty cool and it's interesting again to see someone as mainstream as Kanye starting to use this tech.


I'm sure he's probably getting a ton of backlash for it too.


Let you finish.


Next up, let's talk about using AI to grow your LinkedIn account.


It's actually really easy to do with today's sponsor, Taplio.


Taplio understands your LinkedIn account and the types of posts that work well for you, and it will use GPT-4 to actually write brand new AI-generated posts that are likely to get engagement for you.


And that's just one of the many AI features they have.


They also have a post generator feature where you actually tell it what you want it to write a post about.


For example, how robots will do all of our work for us in the future, and click generate.


And just like that, it uses GPT-4 to write an on-topic post for LinkedIn.


I like this post.


I can either add it to the queue or click edit and post, make whatever tweaks I want.


And then, post it right away or add it to my queue.


They've also got a really cool hook generator.


So, for example, maybe you're trying to write a post about seven ways AI will change the world.


Click generate Hooks and there we go.


Five Hooks that we can use to start off the perfect LinkedIn post.


They also have a new Carousel Generator.


It even lets us use YouTube videos to generate carousels.


So, let's grab a link from one of my recent videos here, select YouTube as the option, and let's generate a carousel.


See what it comes up with for us.


And look at that, unveiling CES 2024, a journey into tomorrow's technology.


And you can see it created a whole Carousel based on the various topics that I touched on in that video.


They've also got an AI content repurpose tool.


I can plug in an article or YouTube video.


Let's just go ahead and plug in that same video here and click generate.


And watch as it generates a post for LinkedIn that's super relevant to the video I just talked about.


And there we go, we got a post that's a nice little recap of my experience at CES.


And these are just the AI features that I'm showing off.


There are so many other features inside of Taplio.


If you're trying to get your LinkedIn game on point, but maybe you're struggling with what to write about, Taplio is the tool for you.


You can learn more by checking out Taplio.com.


They have a free 7-day trial and a 30-day money-back guarantee when you sign up.


And just to keep things easy, I will make sure the link is in the description.


Thank you so much to Taplio for sponsoring this video.


It's kind of motivating me to get back on top of my LinkedIn game.


Here's something that's really cool that's recently rolled out into beta access.


I managed to get my hands on the beta access of it, and it's an animation tool from Unity called MuseLab.


And it lets you go from a text prompt to a sort of human animation.


And then, once the animation is created, you can tweak it a little bit.


So, if I go ahead and launch this app here, you can see over on the left, I have a prompt box.


It says describe a motion here, then press generate.


So, let's just do jumping jacks and click generate.


And you can see it attempted to make an animation of this character doing jumping jacks.


Now, it's not perfect, but I can clean it up if I want.


I can click on make editable up at the top.


You can see it actually gave me two key frames.


So, all it's really doing is animating between these two key frames.


So, if I want to tweak it a little bit and dial it in so it looks more like a jumping jack, I can do that.


I can actually grab the arm here, this arm here, move it up.


Let's spread the legs a little bit like this.


And now, if I press Play, you'll see it'll kind of loop between these two.


Not perfect, let's go ahead and bring the arms all the way down, down here.


And now, let's animate it.


And we have a real rough person doing jumping jacks.


I can even add more key frames.


If I press this plus here, it will add my little dude here.


And I could, let's just do something really wacky with his body.


Let's put his leg back here, this leg goes up here, their body is tweaked like that.


Now, if I animate it, it's going to animate between those scenes.


So, it starts with AI, you generate what you want the animation to look like with a prompt.


Once you've prompted it, it gets you kind of close to what you want.


And then, you basically set these key frames, and you can animate it however you want.
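
For anyone curious what "animating between key frames" means in practice, here is a minimal sketch using plain linear interpolation. This is an assumption for illustration only, not Unity's actual method: the pose representation (joint name to angle), the joint names, and the `lerp_pose` and `sample_animation` helpers are all hypothetical.

```python
from typing import Dict, List

Pose = Dict[str, float]  # joint name -> angle in degrees (hypothetical representation)


def lerp_pose(a: Pose, b: Pose, t: float) -> Pose:
    """Linearly blend two poses; t=0 gives a, t=1 gives b."""
    return {joint: a[joint] + (b[joint] - a[joint]) * t for joint in a}


def sample_animation(keyframes: List[Pose], frames_per_segment: int) -> List[Pose]:
    """Generate in-between frames for each consecutive pair of key frames."""
    frames: List[Pose] = []
    for start, end in zip(keyframes, keyframes[1:]):
        for i in range(frames_per_segment):
            frames.append(lerp_pose(start, end, i / frames_per_segment))
    frames.append(dict(keyframes[-1]))  # finish exactly on the last key frame
    return frames


# Hypothetical jumping-jack key frames: arms down, arms up, arms down again.
arms_down = {"left_arm": 0.0, "right_arm": 0.0, "leg_spread": 0.0}
arms_up = {"left_arm": 170.0, "right_arm": 170.0, "leg_spread": 40.0}
frames = sample_animation([arms_down, arms_up, arms_down], frames_per_segment=4)
```

Adding a third key frame, like in the demo, just adds another segment to blend through; real animation tools typically use smoother curves than a straight linear blend.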


It's like a much more advanced version of Mixamo if you've ever used Mixamo.


But instead of selecting a pre-designated animation, you can prompt an animation and then dial in the animation to exactly what you want.


Again, this one's in beta, but if you have a Unity account, I believe you can apply, and they are rolling it out to people now because I got access, and I have zero prior relationships with Unity, so they're not doing me any special favors or anything.


So pretty sure that you can get access if you sign up for the beta.


I just don't know how long it takes to get approved into the beta.


Google's had a handful of interesting announcements lately as well, including some new updates to Google Chrome.


Now, if you're on the latest version of Google Chrome and you want to try some of these new AI updates, simply come up to the three dots in the top right corner, come down to settings, and then over on the left, you'll see an option for experimental AI.


If you click on this and turn this switch on, you can see we now have two more options: tab organizer and create themes with AI.


To apply the changes, I need to relaunch Chrome, so I'll go ahead and do that real quick, and we'll notice these new features.


The first one being smartly organize your tabs.


If I come up to one of my tabs up here at the top of my Chrome browser and right-click, you'll actually see a new button that says organize similar tabs.


If I click on this and then click "Let's go," you can see that it's suggesting a tab group called AI news.


Imagine that. I create the group, and you can see it added a new button up in the top left that says AI news, and it folds it all up into that tab.


If I click on AI news again, it unfolds all the AI news that it just wrapped up.


One of the other new AI features is you can create your own themes with AI.


To do that, we'll just open a new tab in Chrome.


Here, I've got a new button down in the bottom right corner that says customize Chrome.


If I click on change theme, I can now create a new theme with AI.


So let's pick a subject.


I'll go ahead and do mountains.


Let's do a cyberpunk style and an intellectual mood.


See what that does for us.


Click create.


Now I can select one of the options it gave me.


Let's go ahead and select this one.


And just like that, I have a new customized AI-generated Chrome theme.


The other new feature that they announced but haven't rolled out into Chrome yet is getting help drafting things on the web.


Whether you want to leave a well-written review for a restaurant, craft a friendly RSVP for a party, or make a formal inquiry about an apartment rental, this new feature will help you write with more confidence.


This appears to be coming within the next month.


In other Google news, Google is now using Gemini with their search ads.


Google now has this sort of conversational experience to create ads where you go back and forth in a conversation box to dial in the way you want your ads to look and sound, and it is now using Gemini to power that conversational tool.


Google also rolled out a new tool which will allow you to inject your own face, your own selfie into historical images and timelines.


You can take a picture and work yourself into the Roman Empire or wherever you want to put it.


It's a feature called Art Selfie, and it's actually available inside of the Google Arts and Culture app, which is available on both iOS and Android if you want to play around with it.


Since we're on the topic of AI art, let's talk about Midjourney.


They did roll out some marginal new features.


You can now use Pan, Zoom, and Vary (Region) with the V6 model.


If you remember, up until now, we were only able to use those features in version five and I think earlier versions, but I'm not sure, but definitely version five.


And when you upscaled an image in version six, you only got the options to upscale and Vary (Subtle) or Vary (Strong).


So if I create something like this, this robot wolf hybrid here, and let's say I want to upscale number four, I now have all of these options even though I'm using version six.


Just as a quick refresher, Vary (Region) is essentially the inpainting feature.


So for example, I could select around the wolf's eyes like this and say make the wolf wearing sunglasses, and we get some variations of the same image just with sunglasses.


That's a feature that just became available in version six this week.


And in other AI art news, the tool Nightshade is now available for artists to use.


Now, I did talk about Nightshade in a previous news video months ago.


It's basically a tool that artists can sort of apply to their art, and if somebody tries to train on that art, it screws up the training data for all of the art models.


So the news about the tech is sort of older news, but the fact that it's now available is, well, this week's news.


The AI company ElevenLabs just raised $80 million in their Series B, and this is significant because this actually puts the company at a valuation of over a billion dollars, making it another AI unicorn company.


And since we're on the topic of ElevenLabs, they also rolled out a new feature this week called Dubbing Studio.


With Dubbing Studio, you can upload an audio or a video file.


It will automatically detect what language the file is in and whether there are multiple speakers.


And then, it will translate the file into whatever language you want it to translate into, and it will still match the same voice and inflection of the original audio, making it easier for anybody to dub any video or audio file into any language.


In a land where the sun scorches the Earth, you ain't from around here, are you, boy?


And since we're on the topic of AI audio generation, this news has been going around this week where a fake Joe Biden robocall has been circulating, telling people not to vote but in Joe Biden's voice.


But if we listen to this robocall, it's not super convincing.


What a bunch of malarkey!


We know the value of voting Democratic when our votes count.


It's important that you save your vote for the November election.


We'll need your help in electing Democrats up and down the ticket.


Voting this Tuesday only enables the Republicans in their quest to elect Donald Trump again.


Your vote makes a difference in November, not just Tuesday.


If you would like to be removed from future calls, please press 2 now.


I mean, it obviously sounds like a fake AI-generated voice, but I guess some people fall for it.


OpenAI released some new updates this week as well, including lower pricing for GPT-3.5 Turbo, as well as an updated version of GPT-4 Turbo.


Apparently, this new version of GPT-4 reduces cases of laziness where the model doesn't actually complete a task.


They also have some bug fixes in it.


They also made an update to their moderation model, text-moderation-007.


Also, this week, the news has been circulating that OpenAI is planning to set up chip factories.


As of right now, almost all of the GPUs that are being used to train these AI models are coming from NVIDIA, and companies like OpenAI really want to reduce their reliance on companies like NVIDIA.


Sam Altman, he's out there trying to raise about a hundred billion dollars for a network of chip factories spread across the globe.


But the problem that we're seeing now with training these AI models is less and less the GPUs and more and more the energy consumption it takes to run these GPUs.


Sam Altman actually said in a recent interview that AI needs some sort of energy breakthrough in order to continue to advance these AI models.


It was announced this week that Hugging Face and Google are partnering up for an open AI collaboration.


That isn't to say OpenAI, one word, but open AI, two words.


Basically, Google wants to start making open-source projects with Hugging Face, similar to what Meta is doing.


And quite honestly, it sounds like a similar strategy and tactic to what Meta and Microsoft have going on together.


So if you look at Meta, they're trying to develop a lot of open-source models, and they're working with Microsoft to allow people to use the Microsoft Azure Cloud to help develop these models.


Well, Hugging Face and Google seemingly have a similar partnership.


Hugging Face is a place where a lot of these open-source models are made available for people, and they're going to be working with Google to be able to process a lot of this stuff inside of Google's Cloud.


It's just a fascinating world that we're in right now where we look at Google, and Google is almost mirroring what other companies are doing.


Obviously, OpenAI launched ChatGPT and DALL·E 3.


And then they worked with Microsoft to get it inside of the search engines.


Google had their code red, red alert thing, or whatever they did, and started rolling out Bard and the generative search experience.


They also made their Imagen image creator available inside of Google.


Then we saw Meta work with Microsoft.


Meta was developing open-source Large Language Models as well as other AI tools, and they teamed up with Microsoft to use their clouds to help normal people use some of these models.


Now Google is collaborating with Hugging Face to help make it easier for other people to use these open-source models.


So, just a very interesting dynamic that we're seeing.


Microsoft makes these smart moves, and then Google kind of follows behind, going, "That was a smart move, let's do something similar."


Now, I don't know if that's really what's happening, that's just what it feels like from the outside perspective.


This week, Apple Podcasts started automatically transcribing podcasts.


So, if you're using their podcast player, you should now be able to look at an AI-generated automated transcript for any podcast on the platform.


And finally, this isn't AI news, but it's interesting nonetheless.


Disney is rolling out this platform where you could sort of walk in place, and it's designed to be able to walk around in a metaverse in virtual reality, feel like you're walking, and you're actually just standing in place.


And it works with multiple people on it.


However, all of the demos that I've seen, they look like they're walking in slow motion.


If you're going to be playing a shooter game, if you're going to be playing Call of Duty inside of virtual reality, you're going to need to be able to run on this thing.


So, it'll be interesting to see where they take this thing, but pretty cool tech nonetheless.


And before I wrap up, I actually want to let you know that I am going to be speaking at an event called Content Hacker.


This is going on during South by Southwest out in Austin on March 12th and 13th.


And on that first night, on March 12th, we're doing a VIP experience where I will be on a panel with some other really, really smart people, and we'll be talking about the future of AI, content marketing in AI, and really just anywhere we want the discussion to go.


It's going to be a really, really fun evening, and there's going to be snacks and a bar and all that good stuff.


So, it should be a really, really good time.


Again, it's coming up March 12th through 13th, and I wanted to let you know now because during South by Southwest in Austin, hotels sell out really, really fast.


If it is something you're interested in going to, you probably want to start planning for it now.


I'll make sure it's linked up in the description below.


It's going to be a fun evening, and I couldn't be more excited to be a part of it.


It's going to be my first South by Southwest.


I've never been before, so really, really looking forward to it.


So, check it out, contenthacker.com.


The link will be in the description, and that's what I got for you today.


If you love nerding out about AI, AI tools, AI news, all of that kind of stuff, check out futuretools.io.


I keep this site up to date on a pretty much daily basis with all the coolest AI tools I come across.


I keep the AI news page up to date on a daily basis, and I have a free AI newsletter where I share just the most important pieces of AI news and just the coolest AI tools of the week.


And when you sign up for the free newsletter, you get access to the AI income database, which is a whole database of ways to make money using AI.


Completely free.


You can find it all over at futuretools.io.


And finally, if you like videos like this, give this one a thumbs up and consider subscribing to this channel.


I will make sure more videos show up like this inside your YouTube feed.


Thank you so much for hanging out and nerding out with me.


I really, really appreciate you.


Lots more cool AI nerdery to come on this channel, so make sure you subscribe so you don't miss it.


I'd really appreciate it.


And thanks again to Taplio for sponsoring this video.


You guys rock.


All right, I'm done.


See you later.



