AI news has actually been kind of slow for the last couple of weeks.


We had CES where a whole bunch of big announcements were made and a whole bunch of cool products were revealed, and then it was just kind of quiet.


There wasn't a lot to talk about.


However, things are really starting to pick up steam again, and as we roll into February here, there have been some exciting new developments and tools that we get to play with now.


So let's jump right in.


Starting with the fact that we can now generate images directly inside of Bard.


Bard is finally catching up with ChatGPT, at least in the ability to generate images directly within the chat platform.


If we take a peek at the bard.google.com/updates page here, you can see that on February 1st, they rolled out the update where we can now create images with Bard.


Now, it is important to keep in mind that it's not yet available in all countries.


According to the support docs, image generation in Bard is available in most countries, except in the European Economic Area, Switzerland, and the UK.


And right now, it only works with English prompts.


But if you want to start generating images within Bard, it's simple.


Just head on over to bard.google.com.


You'll get the chatbot that you're used to directly in Bard, and you just tell it what you want from it.


For example, generate an image of Mario kitesurfing.


This time, it generated just one image.


Usually, it generates two images.


Not sure why it only generated one, but it does give you this button to generate more.


And after a few seconds, it'll generate more.


Once again, it only generated one instead of two more, but those are clearly Mario kitesurfing.


Now, one thing that I've noticed when using Bard is it denies a lot of prompts, like prompts that it doesn't seem like it should be denying.


When I posted about it on X today, that was pretty much the number one complaint people had about it.


Cyris here gave an example where he says, "I want images of a brown cow drinking chocolate milk in a field."


And it said, "Unfortunately, I can't generate images that are physically impossible or misleading, like a cow drinking chocolate milk. Cows are lactose intolerant, meaning they can't digest milk properly. While the image is a fun thought experiment, it wouldn't be accurate to reality."


Or what if I try to ask it to generate an image of any sort of real person, like "generate an image of Tom Hanks"? It says, "I can't generate images of that. Try asking me to generate images of something else."


Yet, when I ask it to generate something that's clearly trademarked IP, like Sonic the Hedgehog, it has no problem with that.


Very hit or miss on what it will actually let you generate.


There's been some really random prompts that it should have no problem generating, nothing offensive about them, and it just says it can't for some reason.


My buddy MattVidPro actually made a video all about what Imagen 2 is capable of, and he was getting a lot of the same results in his video. He was asking it to create images of a human with eight arms and things like that, and it was saying, "I can't generate that. That's outside of our policies," or whatever.


I also learned from this video from MattVidPro that you can generate images directly inside of Google's Test Kitchen, inside of their tool called ImageFX.


Google's AI Test Kitchen is where they release a lot of their AI projects early, and they allow testers to come in and play around with them before they are pushed out to the whole world.


And if I go to Test Kitchen and scroll down, I can see that we have ImageFX here, and this is a purpose-built image generator.


Obviously, Bard is a chatbot with image generation built into it. ImageFX appears to be a user interface solely for the purpose of generating images with Imagen 2.


I'd like to say it gives you a little bit more control than trying to generate the images directly in Bard, but really the only additional options it gives you are the ability to set a seed down here and an "I'm feeling lucky" button, which will generate a random image for you, such as a turtle made of potato chips.


Actually pretty good.


And it's got these extra little prompt helpers down here, so I can add things like "handmade" and "35mm film."


So it gives some suggestions to improve the prompt, and under each suggestion, it also gives you a little drop-down, so instead of "made of," you can try "sculpted out of," "painted on," or "drawn on."


Instead of handmade, digital, painted, sculpted.


Instead of 35mm film, we've got some other options here.


But other than letting you set the seed and giving you some sort of prompt helpers, there's not really much more control.


But it does generate four images at a time instead of two without you needing to press the generate more button.


And if you're curious what this prompt looks like, a turtle made of potato chips, handmade 35mm film, there you go.


I actually thought the original was better.


One thing the other Matt figured out in his video was that Imagen is really good at sort of photo-realistic images, almost Midjourney version six level, but not quite as creative and also not quite as prompt adherent as DALL·E 3 inside of ChatGPT.


And to sort of pile on the negatives, he also found that for a lot of random prompts that should have worked, it told him it couldn't generate them for some reason.


So I don't know, play around with it yourself, have some fun with it, see what you can generate.


It's free to use right now.


What's not free to use is Midjourney, and Midjourney just rolled out a brand new AI image generation model.


If you're into generating cartoon and sort of anime images, they just rolled out the Niji V6 model, which generates these sort of anime-style images.


If you have a Midjourney account and you want to play around with this, jump into Midjourney, type /settings, click on the model drop-down here, and we can see Niji model 6 Alpha.


Select that, dismiss the message, and now when I enter a prompt, it will generate it in this Niji mode.


I could prompt something like "a human cyborg battling a human ninja," and you can see it added --niji 6 at the end.


And alternatively, if you don't want to change it in the settings, you just want to generate one image, you can also add this --niji 6 to the end of your prompt to force it into this Niji mode.
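
If you keep your prompts in a script or a notes file, that same trick can be sketched in a few lines of Python. This is only an illustration, assuming the --niji 6 suffix described above; the helper name with_niji is made up for this example and is not part of any Midjourney API.

```python
def with_niji(prompt: str, version: int = 6) -> str:
    """Return the prompt with a --niji flag appended (hypothetical helper).

    If the prompt already contains a --niji flag, it is returned unchanged
    so the flag is never added twice.
    """
    text = prompt.strip()
    if "--niji" in text.split():
        return text
    return f"{text} --niji {version}"


# Build the text you would paste into Midjourney's prompt box.
print(with_niji("a human cyborg battling a human ninja"))
```

You would still paste the resulting text into Midjourney yourself; the snippet only builds the prompt string.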


And here's what we get when we generate a cyborg battling a ninja.


While we're on the topic of AI image generation, Shopify is rolling out the Magic AI Image Editor, and here's what that does, using Shopify's own words.


Starting today, we are rolling out new Shopify Magic features in the media editor that use generative AI to make product image editing and enhancement easier than ever, right here in the admin.


With these new AI-powered background editing features, you can select from common style presets for an instant boost of professionalism or reimagine the scene altogether with a short description of what you'd like to see or instantly match the background of your existing photo shoots to make your storefront feel more on brand and cohesive.


Now, this next thing isn't something we actually have access to yet, but I thought it was pretty cool.


Microsoft released some research called StrokeNUWA, and basically, it's a new way of sort of prompting images, more like prompting drawings, I guess, where it turns certain line shapes into vectors in a vector database.


And then, you prompt it using a Large Language Model.


It's sort of like generating AI images in the same way you generate text with something like GPT-4 and ChatGPT, instead of with a diffusion model or a GAN (generative adversarial network) like the other image generators use.


Again, this is still early research.


We don't really have access to play around with this yet, but there are some cool examples of the type of images this will be able to generate.


And yes, they kind of look like children's drawings right now, but we're basically asking a Large Language Model that's similar to a ChatGPT-type model to actually draw pictures for us and not generate them through the standard diffusion models.


It's a completely new approach to generating these images, and I'm really excited to see how this plays out and how this improves in the future.


Here's some more interesting research that was recently made available on Hugging Face for us to actually play with.


It's called Image to Sound FX or Image to SFX.


You upload an image, submit the image, and it will generate a little sound clip to go along with that image.


And that audio was generated with nothing more than the image that was uploaded.


This week, ChatGPT rolled out a brand new feature where you can actually call upon GPTs in your chat and have multiple GPTs all inside of a single chat.


I actually did an entire video breakdown of this new feature.


You can find it by looking for this video right here.


But basically, when I'm inside of the ChatGPT user interface, I can type @, and it will bring up a list of all of the various GPTs I've used recently.


I can select one, prompt it with something like "What's the most recent paper about AI?", and it gives me a response to that.


And without ever leaving this chat window, I can call upon another GPT.


For example, I can call the Diagrams: Show Me GPT and tell it to create a diagram based on the last response.


Just like that, it generated a diagram for me, all in the same flow and the same chat as the first prompt that I gave to Consensus.


So, you can actually start to tie and blend various GPTs together to get really cool final outputs and do some really interesting things inside of ChatGPT.


But again, I have a full breakdown video of it.


While we're on the topic of large language models, something really interesting happened this week.


A new open-source Large Language Model appeared on Hugging Face for people to use, and it was giving outputs that were near GPT-4 level outputs.


The model was called Miqu (MIQU), and according to this post from N8 Programs here, it gets an 83.5 on the EQ Benchmark, surpassing every other Large Language Model in the world except for GPT-4.


It even beats Mistral Medium, which is supposedly Mistral's best model right now.


Now, when this model appeared on Hugging Face, there was a lot of speculation that this was actually a leaked version of the Mistral Medium model.


That's kind of a tongue twister.


And that possibly Mistral themselves leaked it.


And when people started doing tests with it, comparing it to the Mistral Medium model that's available on Perplexity, it was pretty much giving the exact same outputs, which all but confirmed that it was a leaked version of Mistral Medium.


Well, this week, the Mistral CEO confirmed that this leaked open-source AI model is actually one of their Mistral models.


In this article, it says, Today, it appears we finally have confirmation.


Mistral co-founder and CEO Arthur Mensch took to X to clarify: An overenthusiastic employee of one of our Early Access customers leaked a quantized and watermarked version of an old model we trained and distributed quite openly.


We retrained this model from LLaMA 2 the minute we got access to our entire cluster.


The pre-training finished on the day of Mistral 7B release.


We've made good progress since.


Basically, all of this to say that Mistral took a version of LLaMA 2, sort of beefed it up, made it better, started getting near GPT-4 results.


And then an employee at one of Mistral's early access customers leaked it onto Hugging Face, and now it's accessible for people to actually play around with.


According to the CEO, it's actually one that they trained a while ago, and they actually have better models now.


So Mistral probably has some game-changing AI in the works that could be pretty exciting.


While we're on the topic of open-source Large Language Models, this week Meta released their CodeLlama-70B model.


According to one of Meta's own posts here, CodeLlama-70B-Instruct achieves a 67.8 on HumanEval, making it one of the highest performing open models available today.


CodeLlama-70B is the most performant base for fine-tuning code generation models.


And if you don't know what that means, don't worry, I didn't really either.


But here's a post from Philipp Schmid here on X basically explaining it.


CodeLlama-70B was able to reach initial GPT-4 performance. When GPT-4 first came out, it was pretty dang good at code, quite a leap from GPT-3.5.


We do now have GPT-4 Turbo, which is probably a little bit better at writing code, but this CodeLlama-70B is about as good as that initial GPT-4, and it's open source.


Anybody can use it and anybody can fine-tune it to their own coding needs.


Commercial use is allowed, so you can build your own software with it and resell it if you want.


There's also been a lot of talk lately around Apple and them getting deeper and deeper into the AI game.


Now this article here on qz.com is actually from September of 2023, so this may actually have changed a little bit since this article came out.


But if we look at this graph, Apple has actually acquired more AI companies than pretty much any other company.


We can see that since 2017, Apple has acquired 21 AI companies versus 19 for Accenture, 12 for Microsoft, 11 for Meta, and then Alphabet's all the way down here at 8.


So although Apple doesn't talk about AI quite as much as the rest of these companies, they're quietly accumulating all of these AI companies.


And this week on February 1st, Tim Cook confirmed that Apple's generative AI features are coming later this year, saying that Apple is putting tremendous time and effort into integrating AI into its software platforms.


Now Apple sort of comically never mentions AI in any of their keynote presentations, but in a recent earnings call with Tim Cook, he apparently mentioned generative AI several times, but never really actually got too specific.


So we don't actually know what that means.


We don't really know what sort of generative AI is going to be added into Apple products.


But the biggest speculation right now is that we're going to get a new version of Siri, most likely announced at this year's WWDC event.


But Tim Cook was very hush-hush on exactly what AI is rolling out, so we're just going to have to wait and see.


And since we're on the topic of Apple, this week is a fairly historic week for Apple.


They are launching a brand new product, a brand new gamble for their company with the Apple Vision Pro.


It seemed to get into a lot of other YouTubers' hands, not mine, but it's going to be available for general consumers starting this week.


In fact, the day that this video goes live is the day that most people who pre-ordered the Apple Vision Pro are going to get access to theirs.


I'm a huge sucker for new tech, so I ordered one, and I'm expecting to have mine hopefully by the time you're seeing this video.


And along with the Apple Vision Pro release, Apple did announce that more than 600 new apps that are purpose-built for the Apple Vision Pro are going to be available at launch.


So they should be available when we get access to our Apple Vision Pros.


I'm not going to go into too much detail on the Apple Vision Pro in this video.


By the time I make another video, I will have had access to one.


I should be able to dive a little bit deeper into what I know about Apple Vision Pro at that point.


This week, we also saw a new mobile browser called Arc Search.


In fact, if you watched my other video all about really cool AI apps available on mobile phones, which looks like this video right here, you've already seen me play around with Arc Search.


It's basically a browser where you can enter any prompt you want.


For example, I type "San Diego Padres" and click "Browse for me."


And then, it generates a pre-built web page for me with everything I would want to know about the Padres, including images of the stadium, details about the team, their home stadium, their achievements, the top search results, details about the franchise history, details about the current season, details about notable players, and upcoming games.


It basically creates a single web page with everything you'd want to know, all on that web page, so you don't need to click around and browse different web pages to find what you're looking for.


If I'm honest, I think it's a really cool app.


I think this is sort of what end users want.


I don't think people want to have to dig through a whole bunch of websites to find the exact thing that they're looking for.


It's nice to just type in a topic and just get all of the details about that topic presented right in front of you.


However, it does sort of present a problem to content creators.


If I'm creating content and tools like this are just surfacing the information people are looking for, there's no incentive to click over to my website, which means I might miss out on ad revenue and the ability to build my mailing list and the ability to offer products on my website.


It kind of disincentivizes content creators.


I like the concept, and from a user standpoint, it does seem to make my life easier, but from a content creator standpoint, I have some concerns.


But I'm curious what you think.


Let me know in the comments.


This week, the New York Times announced that they're building a team to explore AI in the newsroom.


If you remember, the New York Times is one of the companies suing OpenAI right now for scraping content from the New York Times, but they're putting together their own internal AI team so that they can explore generative AI and what it could possibly be used for within the New York Times.


They do, however, say that journalists will still write, edit, and report the news.


If I had to guess, it sounds to me like the New York Times wants to build a New York Times bot.


So instead of you prompting to find news inside of ChatGPT, if you want to find information specifically from The New York Times, you can go and ask the New York Times bot, and it will be trained on all the various New York Times articles and respond based on the information that's solely within the New York Times database.


That's sort of my guess of what's going on here.


Here's something that I don't think anybody's going to complain about, except for maybe scammers and spammers: the FCC is moving to outlaw AI-generated robocalls.


Now, they're doing this to try to prevent and minimize scams and spam calls and things like the fake Biden calls.


But if I'm being honest here, bad actors are going to be bad actors, and even if it's illegal, they're just going to do it anyway.


So, I don't know if outlawing it is really going to have an impact on the people that are using this technology to scam others.


But pretty soon, it's going to be on the books that it's illegal to do that, although I think most people already see it as pretty highly unethical.


This week, we also got a robot that was trained to read Braille at twice the speed of humans.


The research team from the University of Cambridge used machine learning algorithms to teach a robotic sensor to quickly slide over lines of Braille text.


The robot was able to read the Braille at 315 words per minute at close to 90% accuracy.


Now, they say that the robot Braille reader was not developed as an assistive technology. It was really designed more as a test to find out how good sensors are, almost like a benchmarking system to test future systems and see how well they can read Braille.


However, this article does go on to say that they are thinking about putting this technology inside of the fingertips of humanoid robots, so that they can read Braille, as well as into things like prosthetics.


But it does sound like the initial use case for this is more for robots than for humans.


In the last week, we learned about a new AI model called Morpheus 1, which claims to induce lucid dreaming.


It's basically this headband that they call the Halo, and the Halo sends sound waves, or ultrasound holograms, into the brain that connect with the current brain state and, combined, can put the mind into a lucid state.


Now, if you're not familiar with lucid dreaming, lucid dreams are a type of dream where the dreamer becomes aware that they're actually asleep.


So when you're in the dream, you realize that you are asleep and in the dream, effectively giving you control over the dream.


The company also says they're going to start beta testing this thing in Spring of this year, 2024.


This is one of those technologies that I think is really cool, but I have a hard time sort of comprehending it working, if I'm honest.


Like, I would love to see it, but for some reason, I still feel very skeptical about it.


Hopefully, I could be an early beta tester on something like this because I would love to try it and be proven wrong and be able to have more control over my dreams because that just seems really cool.


And finally, this week, in some pretty major news, Neuralink, from Elon Musk, implanted its device in its first human patient.


Now, the Neuralink itself is this little circular chip that's about the size of a quarter that's actually designed to be implanted in your skull and actually help your brain interface with a computer.


Now, the first use cases for this product are for quadriplegics, to help them regain some of the functionality that they lost.


But I feel like it's only a matter of time before we start getting that Matrix-like scenario where you could just download how to do kung fu into your brain.


