


We think that within maybe a year or two from now, like the models will be unrecognizable from what they are today.


This year, we also plan to push that boundary even more, and we expect our next frontier model to come and provide like a step function in reasoning improvements as well.


There was actually a very, very interesting announcement that was made.


But including myself, a lot of people didn't actually realize what the announcement was.


It was something that was quite under the radar.


But in this video, I'm going to be showing you guys all of these secret announcements and some of the secret updates that are coming to OpenAI's models in the very near future, including some of the key dates that were actually unveiled at a secret presentation.


One of the things that you can see here is this is an image that has been floating around on the internet for the past 24 hours.


I've confirmed that this image is from the Viva Technology event, which is commonly known as VivaTech.


It's an annual technology conference dedicated to innovation and startups.


It's held in Paris, France, and was founded in 2016 by Publicis Groupe.


The event takes place at the Paris Expo.


It's pretty much the, or I should say, one of Europe's largest tech and startup events.


It's mainly focused on tech innovation along with some other key business insights.


From this image, it seems pretty simple, but there was something that a lot of people did miss.


We can see here that on the image that you're currently looking at, we have three main points.


First, we have 2021, which is the GPT-3 era, with the davinci model.


This was of course in 2021.


You can see now, and this is actually super, super interesting.


This is why I think this video is remarkably important.


You can see that we moved in 2023 to the GPT-4 era, which is here.


You can see that this is around the time of GPT-4 being released and being deployed.


What's very, very interesting is that it now shows a new piece of information that I would say is a little bit interesting.


You can see that they describe this as a GPT-Next, okay?


I think that maybe GPT-5 might not be coming.


When I say that, I don't mean GPT-5 isn't actually coming.


What I mean is that GPT-5, as you think of it, I think OpenAI are most likely planning a lot more than people think.


Of course, that is something that we even saw with the recently demoed GPT-4o. It's crazy because the image marks today, and you can see it says May 2024.


But what's crazy is that they've actually given us the release date for this GPT-Next model.


You can see that if you look at this right here, this is actually November 2024.


This date is very important for a few reasons, which I'm going to explain in a moment.


But I think one of the first things that really, really surprised me was, of course, the fact that this is called GPT-Next and not GPT-5.


What's really, really crazy about this is that we can clearly see some kind of increasing capability.


It is very, very hard to see this, but on the left-hand side here, it says model intelligence.


You can see that GPT-3's intelligence was around this level. For GPT-4's intelligence, the slide doesn't really benchmark it, but I'm guessing that this point is GPT-4o.


You can see that there is a slight improvement there, although the improvement is slight.


The thing that you need to take into account is that even if improvements are slight, a lot of use cases are going to be pretty insane, because if the model can get smarter and become more reliable, then it can impact a lot more industries overall.


I think one of the most important things that we can see here is that the model intelligence makes a huge, huge jump from GPT-4o or GPT-4.


We can see that literally from this level to this level, we can see that it's that amount of jump.


But from here, it is a quite big jump.


In fact, I probably should be using some actual arrows, apologies for my terrible drawings.


But the point I'm trying to make here is that it seems that the kind of jump that we're about to get here with this GPT-Next model looks really, really surprising.


It's something that they've constantly reiterated that these future models are going to be very, very intelligent in terms of how smart these systems are.


Whilst there might be other features, this is something that we do know.


One thing that I did want to talk about with this GPT-Next model: of course, they could have simply put that this is going to be GPT-5, although of course, they don't want to officially announce it.


It could just be a placeholder for GPT-5.


I think that it might not be GPT-5, but I'm gonna dive into that in one second.


But one thing I want you guys to know is that this release date right here of November 2024 is a key date because this makes sense for the release date of the next model, whether it's GPT-5 or whether it's GPT-Next, whatever other models there are.


I think it's important to note that this date has been said by OpenAI a few times.


One of the key things coming up this year, and I know some people don't live in America, so you might not pay attention, but there are the 2024 United States elections.


This is gonna be taking place on Tuesday, the 5th of November, and you might be thinking, but those are the elections.


What does that have to do with actual AI systems?


I mean, that's politics, this is technology.


In fact, those things are very, very closely intertwined because OpenAI themselves did actually make a statement regarding this, and the elections are actually a reason for the delay of GPT-5, as many people did think that GPT-5 was scheduled to be released in the summer.


However, you can see right here, OpenAI CTO Mira Murati recently confirmed that the elections were a major factor in the release of GPT-5.


We will not be releasing anything that we don't feel confident on when it comes to how it might affect the global elections or other issues, she said last month.


Whilst we did just get a pretty crazy demo of GPT-4o, a multimodal AI that completely shocked the industry, you can see here that OpenAI are really, really concerned about what the future models are going to be able to do in regards to the election.


I think it's gonna be either one of two things.


One of the things is that because there is an election coming up, there are always different discussions on what could happen and the kinds of conversations going on around privacy issues and just a million different conversations that are going to be had.


The problem is that if OpenAI does release a model before the elections, then you could face a negative PR situation.


Like, the public could start thinking negatively about OpenAI.


Of course, yes, this week OpenAI have had a huge, huge, huge amount of bad press due to some of the things that have been going on at the company, from people leaving to Sam Altman doing some questionable things, depending on where you stand.


I think it's important not to release models during that time, because if the technology is truly as advanced as the graph shows us, then it's more likely to be received as something that is threatening democracy in the United States. If it has the ability to influence people, then some individuals might say that this was all timed.


With politics, things do get really, really difficult, really, really quickly.


I think this does make sense.


I could be wrong, but we do have the OpenAI CTO stating that they're not going to be releasing anything, and of course GPT-Next is in quotation marks here and sits at November 2024, just after the elections.


Considering that previous rumors also pointed to around November 2024, I think that this does make sense.


Here's where he actually talks about the GPT-Next models.


This is a very, very fascinating clip and there is also this graph right here.


It is very, very, very hard to see, like very, very, very hard to see, but if you zoom in, you can see that there is also this graph.


You can see the GPT-3 era, GPT-4, and GPT-Next, and you can see that this one doesn't actually have the dates on it.


I'm guessing that when they had the dates on it before, that may have just been a mistake, since it's proprietary information.


But of course, now the information is out there, although we don't know what date it's going to be.


We know that after November the 5th, up until the end of November, there's likely going to be some kind of model.


But anyways, we are really excited for this, and I'm going to show you guys what he talks about here.


There are four investment areas I'd like to cover.


The first key priority that we have is textual intelligence.


Our core belief is that if we increase textual intelligence, that will unlock transformational value in AI.


You can see on the screen here, these are the two major models that we offer today.


GPT-4o, the best model with native multimodality that we just showed, and GPT-3.5 Turbo, 10x cheaper, which is convenient for simple tasks where what you need is really things like classification or very simple entity extraction.


We really expect that the potential to increase the LLM intelligence remains huge.


Today we think models are pretty great.


They're kind of like first or second graders, they respond appropriately, but they still make some mistakes every now and then.


But the cool thing that we should remind ourselves is that those models are the dumbest they'll ever be.


You know, they may become master's students in the blink of an eye.


They will excel at medical research or scientific reasoning.


We think that within maybe a year or two from now, like the models will be unrecognizable from what they are today.


This year, we also plan to push that boundary even more, and we expect our next frontier model to come and provide like a step function in reasoning improvements as well.


The second investment area for us is to make sure the models are cheaper and faster all the time.


We know that not every use case requires the highest level of intelligence.


That's why we wanna make sure that we invest.


You can see here on the screen the GPT-4 pricing and how much it has decreased, by like 80% in just a year.


It's quite unique, by the way, for a new technology to like decrease in price so quickly.


But we think it's like really critical in order for all of you to build and reach scale with what you're trying to accomplish and innovate with your AI native products.
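
To put the pricing claims in concrete terms, here is a minimal sketch of the arithmetic behind "decreased by 80%" and "10x cheaper". The starting price and the monthly token volume are made-up numbers chosen to keep the math easy; they are not OpenAI's actual prices.

```python
def price_after_cut(price: float, percent_cut: float) -> float:
    """Price remaining after a percentage reduction."""
    return price * (100 - percent_cut) / 100


def workload_cost(tokens: int, price_per_million: float) -> float:
    """Cost of processing `tokens` tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
START_PRICE = 50.0             # price per million tokens a year ago (made up)
TOKENS_PER_MONTH = 20_000_000  # monthly usage of an imaginary app (made up)

price_now = price_after_cut(START_PRICE, 80)  # an 80% cut leaves 20% of the price
cheaper_model_price = price_now / 10          # a model that is "10x cheaper"

if __name__ == "__main__":
    rows = [
        ("a year ago", START_PRICE),
        ("after the 80% cut", price_now),
        ("10x cheaper model", cheaper_model_price),
    ]
    for label, price in rows:
        cost = workload_cost(TOKENS_PER_MONTH, price)
        print(f"{label}: {cost:.2f} per month")
```

The point of the sketch is only that the two effects multiply: an 80% cut followed by a 10x cheaper model leaves 2% of the original bill for tasks that don't need the larger model.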


I think that short snippet from this tech conference was rather insightful because he actually said a number of different things in it, but I think some of them were more important than others.


Of course, he talks about the price decreasing, but one of the things that he did mention that was rather, rather fascinating, and this is someone that is from OpenAI, he actually speaks about the fact that literally within one to two years, the models are going to be unrecognizable.


That is something that, even as someone who pays attention to the AI space and who looks at all of the AI updates and many different things that I literally don't even post on this channel, is still rather surprising.


I think it's because humans do have a hard time at grasping the nature of exponential increases in terms of technology and intelligence.


I think this is going to be a truly, truly transformative period in terms of what is going to come out of this company within the next five to 10 years, because if he's stating that literally the models are going to look unrecognizable within one to two years, I mean, in 2026, this isn't far away.


Two years is a very short time period, especially for these kinds of technological developments.


Something that he also said that I thought was also rather insightful was that he mentioned a step function in reasoning.


For the next models, this likely means, as we've already discussed, that this is a significant, discrete improvement in the AI's reasoning capabilities rather than a gradual incremental improvement.


Essentially, in contrast to gradual improvement, a step function implies a sudden, substantial improvement at a particular point, which is followed by a new level of capability.


This change is more abrupt and significant compared to the gradual improvement.
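
The difference between the two shapes can be sketched in a few lines of Python. This is only an illustration of what "step function" means mathematically; the levels, the rate, and the point where the jump happens are arbitrary numbers, not anything taken from the slide.

```python
def gradual(t: float, rate: float = 1.0) -> float:
    """Capability that grows a little with every unit of time."""
    return rate * t


def step(t: float, jump_at: float = 5.0, low: float = 1.0, high: float = 10.0) -> float:
    """Capability that stays flat, then jumps to a new level at one point."""
    return high if t >= jump_at else low


if __name__ == "__main__":
    # Print both curves side by side; the step curve changes only once.
    for t in range(0, 9, 2):
        print(f"t={t}  gradual={gradual(t):.1f}  step={step(t):.1f}")
```

With the gradual curve every interval looks like the last one, while with the step curve nothing changes until the jump and then everything after it sits at the new level.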


With the reasoning abilities, current models like the GPT-3 and GPT-4 have, of course, made significant strides in their ability to reason and understand and generate text.


But their abilities to reason can still be limited in certain contexts.


For the GPT-Next models, a step function in reasoning could mean that they make a substantial leap in their ability to understand, process, and generate more complex, abstract, and logical forms of reasoning.


Because of that increased level of reasoning, it means that they've got improved problem solving.


This means that these models are going to be better at tackling complex problems that require multi-step and logical reasoning.


Of course, this means enhanced understanding.


This means that the AI could understand context and nuances in a more human-like way, leading to more accurate and relevant responses.


As for decision-making, these models would likely be able to make more sophisticated decisions based on the information provided, similar to higher-order thinking.


Like I said before, this is just, once again, gonna open up a lot more applications.


They even spoke about how it's going to be able to do medical research.


We've seen that Google has been pushing widely on that frontier with the medical Gemini, and they've achieved remarkable benchmarks.


I wouldn't be surprised if OpenAI are doing something in that realm.


One of the things that I also want to talk about was, of course, the fact that this release date is rather fascinating.


The name of the model did actually make me think about something that was spoken about previously.


You can see here that we have a model that is called GPT-Next, but one of the things that I spoke about when covering the Sam Altman and Lex Fridman interview was the fact that he said something rather insightful.


He said that the future models that he releases might not actually be called GPT-5.


Of course, there might be GPT-5 because they did trademark it, but he did state that whatever the next model may be, we're not sure when it's going to be released or what it's going to be called.


That's the honest answer.


Blink twice if it's this year.


Before we talk about a GPT-5-like model called that or not called that or a little bit worse or a little bit better than what you'd expect from a GPT-5, I know we have a lot of other important things to release first.


I don't know what to expect from GPT-5.


You're making me nervous and excited.


What are some of the...


Right there, you can see that Sam Altman is actively talking about how they are going to release a few things before GPT-5.


Of course, we've seen things like Voice Engine, we've seen Sora, we've seen a bunch of other things, but the way he talks about how these future models might not even be called what we are expecting them to be called is, of course, rather fascinating too.


One of the things that you may have seen recently was this from Microsoft, and this is basically where they talk about the levels of compute that they're using to train the next frontier models.


In the diagram, they use sharks and marine life to, I guess you could say, help us understand the scale of compute that they are currently using.


We can see that we have a shark here, then of course we have an orca.


Of course, we have a whale.


I mean, the stark increase in capabilities in this graph and the one in the first graph are very, very similar.


The jump from GPT-3 to GPT-4 is only a little bit, and then for the next level, I think OpenAI have clearly discovered something incredible, and they're probably going to shock the world. If you're using that much compute to train something, and you've also improved your architecture, then I think the amount of capabilities that you can get is truly, truly surprising.


I think this is so surprising because it's maybe not just about improved architectures in terms of the transformer. I'm also talking about certain techniques that OpenAI are pioneering and using to advance the frontier in terms of the reasoning and the capabilities of their models.


I'm going to show you guys a short snippet from this clip where they actually give a lot of context on why this is so pivotal.


The only reason I'm showing you guys this clip is because now, with the added context of this previous graph, where we can literally see the capabilities jump, I think it's important to understand the compute side behind it.


That frontier forward.


Like we showed this slide at the beginning, like there's this like really beautiful relationship right now between sort of exponential progression of compute that we're applying to building the platform, to the capability and power of the platform that we get.


I just wanted to, sort of without mentioning numbers, which is sort of hard to do, to give you all an idea of the scaling of these systems.


In 2020, we built our first AI supercomputer for OpenAI.


It's the supercomputing environment that trained GPT-3.


Like, we're going to just choose marine wildlife as our scale marker.


You can think of that system about as big as a shark.


The next system that we built, scale-wise is about as big as an orca.


Like that is the system that we delivered in 2022 that trained GPT-4.


The system that we have just deployed is like scale-wise about as big as a whale relative to like, the shark size supercomputer and this orca size supercomputer.


It turns out like you can build a whole hell of a lot with a whale size supercomputer.


One of the things that I just want everybody to really, really be thinking clearly about, and like, this is going to be our segue to talking with Sam is the next sample is coming.


Like this whale size supercomputer is hard at work right now, building the next set of capabilities that we're going to put into your hands.


If you saw it there, he said, you can build a whole hell of a lot of AI with a large amount of compute.


I'm really intrigued as to what a whole hell of a lot of compute is going to be giving us.


But one thing I do know is that there is going to be a huge amount of capabilities.


Something that they also spoke about was of course, multimodal agents.


This is going to be something that is here within the next level of frontier state-of-the-art models.


I think that maybe this year we get something, but there are also some other things I do want to talk about as to why we might not get that.


But they also demoed the multimodal agents.


Of course, you can see that their investment areas are the textual intelligence, cheaper and faster models, the custom models, and of course, multimodal agents.


I want to show you guys this short clip because OpenAI haven't really shown us that much in terms of the agentic workflows.


But I think it's important to take a sneak peek because agents are truly going to change the way we interact with computers.


We really believe that in the future, agents may be the biggest change that will happen to software and how we interact with computers.


Depending on the task, they'll be able to leverage text, they'll be able to leverage access to some context and tools.


Again, all of these modalities that we mentioned will bring also a fully natural and novel way to interact with the software.


One example of this that I personally love is Devin by the team at Cognition.


They built essentially an AI software engineer.


It's pretty fascinating because it's able to take a complex task and it's able to not just write code, but it's able to also understand the task, create tickets, browse the internet for documentation when it needs to fetch new information.


It's able to deploy solutions to create pull requests and so on.


It's one of those agentic use cases that I really love.


In fact, this tweet from Paul Graham earlier this year caught my eye because he mentioned or realized that the 22-year-old programmers these days are often as good as the 28-year-old programmers.


I think when you reason about how the 20-year-olds are already adopting AI and tools like Devin, it's no surprise that they're getting more and more productive thanks to AI.


Another agent experience that I think this time is more towards consumer is Presto.


Presto is letting customers place orders with their voice, so using a voice agent.


Of course, there's not many drive-throughs here in Europe.


But what I found compelling about this example is that it's really helping a market where there's been a labor shortage.


In turn, that helps offer not only a great experience, but also lets the staff actually focus on food and serving the customers.


But with that, I'd like to dive into a couple more live demos to illustrate a little bit how you can build assistive experiences and agents practically today.


Our first incarnation of.


With that, you can see that this AI-powered drive-through system is actually having an impact on people.


Because one of the things that you might not understand about drive-throughs is that they're kind of limited to human intelligence.


One of the things I was thinking about was a demo I covered in a weekly AI video, where someone was actually going through a drive-through with an AI system.


They basically spoke about how it was so crazy because an AI system is able to completely understand exactly what you want.


It's able to understand exactly what you want in other languages too.


It's also able to converse with you in other languages too, much more fluently than just someone who only speaks one language and isn't bilingual or able to understand other languages.


It's patient and it's fast.


I think it's something that's going to allow a lot more unique experiences.


That's why agents are something that is very, very impactful because I think this is where you're really going to see that real-life impact other than just in a day-to-day LLM interface.


Welcome to Wendy's.


What would you like?


Can I have a chocolate frosty?


Which size for the chocolate frosty?




Can I get you anything else today?


No, thank you.


Please pull up to the next window.


Let's take a look at some of these demos of these agentic workflows that you can actually use and do and what they've shown us in this presentation.


Our first incarnation of agents for developers is what we call the Assistants API.


The Assistants API is a complete toolkit that all of you can use in order to bring assistants into your products.


In this case here, I'm building this travel app called Wanderlust.


As you can see, there's a map on the right side, but there's also an assistive experience on the left side.


This is completely powered by the Assistants API.


Let's take a quick look.


If I say, top five venues for the Olympics in Paris, first of all, first thing to note, I don't have to manage any of those.


Let's refresh the app a little bit.


Sounds like we maybe lost network.


Top five venues for the Paris Olympics.


The first thing to note is that I don't have to manage that conversation history.


That conversation history is automatically managed by the Assistants API from OpenAI.


I don't have to manage my prompt and so on.


Not sure what's happening here.


Let's take a quick look.


Might have lost some Wi-Fi or connection.




Let's try it one last time.


Let's go to Rome.


There we go.


Sounds like the Olympics was bad luck, but it sounds like we're back.


I don't have to actually manage any of those messages.


The conversation history is automatically managed by OpenAI.
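
As a rough picture of what "managed conversation history" means, here is a toy sketch in plain Python. It is not the OpenAI SDK: the `Thread` class, `fake_model`, and `send` are hypothetical names invented for illustration, and the "model" only reports how much context it received. The idea it shows is that the caller sends only the new message, while the thread object keeps and replays everything said so far.

```python
from dataclasses import dataclass, field


@dataclass
class Thread:
    """Toy stand-in for a server-side conversation thread (hypothetical)."""
    messages: list[dict[str, str]] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})


def fake_model(history: list[dict[str, str]]) -> str:
    """Placeholder for a model call; it only reports the context it saw."""
    user_turns = [m["content"] for m in history if m["role"] == "user"]
    return f"(reply using {len(user_turns)} user turn(s); latest: {user_turns[-1]!r})"


def send(thread: Thread, text: str) -> str:
    """Append the user's message, call the model with the full history, store the reply."""
    thread.add("user", text)
    reply = fake_model(thread.messages)
    thread.add("assistant", reply)
    return reply


if __name__ == "__main__":
    thread = Thread()
    print(send(thread, "Let's go to Rome"))
    print(send(thread, "Top five things to see"))
```

In the demo the same bookkeeping happens on OpenAI's side, which is why the developer never assembles the message list by hand.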


The second thing that's really cool to call out here is that, as you could see, when I started to interact with these messages, the map zoomed automatically.


That's one of my favorite features when I build agents.


It's called function calling.


Function calling is the ability for all of you to bring knowledge about your unique features in your app and your unique functions over to the model, in this case, GPT-4.


If I say top five things to see in Rome, let's see what happens here.


In theory, what should pop up here is, once again, an interaction between the text and the map.


Here we go.


As you can see, as we are talking to the model, it's able to actually pinpoint the map because it knows that this feature exists.


It's really, really cool.


That's already available as part of the toolkit of the Assistants API.
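
The map behaviour in the demo can be sketched as follows. This is a minimal illustration of the function calling pattern, not the code from the Wanderlust app: the `update_map` function, its parameters, and the shape of `example_call` are assumptions. The app describes a function to the model in a JSON-Schema-style structure, the model answers with a function name plus JSON arguments, and the app runs the matching local function.

```python
import json

# Hypothetical tool description, written in the JSON-Schema style
# that is commonly used to describe functions to a model.
UPDATE_MAP_TOOL = {
    "type": "function",
    "function": {
        "name": "update_map",
        "description": "Center the map on a location and set the zoom level.",
        "parameters": {
            "type": "object",
            "properties": {
                "latitude": {"type": "number"},
                "longitude": {"type": "number"},
                "zoom": {"type": "integer"},
            },
            "required": ["latitude", "longitude", "zoom"],
        },
    },
}

# Stand-in for the app's real map widget.
map_state = {"latitude": 0.0, "longitude": 0.0, "zoom": 1}


def update_map(latitude: float, longitude: float, zoom: int) -> dict:
    """The app's own function; the model never runs it, it only asks for it."""
    map_state.update(latitude=latitude, longitude=longitude, zoom=zoom)
    return dict(map_state)


HANDLERS = {"update_map": update_map}


def dispatch(tool_call: dict) -> dict:
    """Look up the requested function and call it with the decoded arguments."""
    handler = HANDLERS[tool_call["name"]]
    arguments = json.loads(tool_call["arguments"])
    return handler(**arguments)


# A call shaped like one a model might emit after "Let's go to Rome".
example_call = {
    "name": "update_map",
    "arguments": json.dumps({"latitude": 41.9028, "longitude": 12.4964, "zoom": 12}),
}

if __name__ == "__main__":
    print(dispatch(example_call))
```

The design point is that the model only produces structured data; deciding which functions exist and actually executing them stays in the app's hands.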


Another tool I wanted to call out here is knowledge retrieval.


We know so many of you want to bring factual data into the conversations with models like GPT-4.


Usually, you have to build a retrieval stack to do so.


We've learned from so many developers how complex that can be.


We've made a ton of improvements in our retrieval stack.


I'm going to try to see if I can actually demo this in real time.


I actually bought this book to prepare a trip to Italy from Lonely Planet.


It's a pretty comprehensive book.


It has like 250 pages.


It's like 95 megabytes.


I hope the upload is going to work.


I'm taking a bit of a risk here.


But what's happening in real time is that as soon as the file is uploaded, it will be automatically embedded by the Assistants API, so that I don't have to think about any of these things.


I will be able to just start interacting in the conversation and say, based on this book, what's the best photo spot in Lazio?


Before I press Enter, I'll show you a quick look at page 126, I believe.


Let's go to page 126.


The page 126 talks about Lazio, right?


I'm going to ask the question here.


What's the best photo spot in Lazio?


As I'm browsing the book, we're noticing here that the photo opportunity was mentioned on page 128.


It's supposed to be Pitigliano.


Boom, in real time, we were able to find in this book that this is exactly the place for a photo spot.


Again, I had to do no engineering work.


I just had to upload the file in the conversation and it was all taken care of for me.
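
For a sense of what a retrieval stack does behind an upload like this, here is a deliberately simplified sketch. It swaps in plain word counts and cosine similarity where a real system would use learned embeddings, and the page texts are invented placeholders, not quotes from the book. The idea is the same: turn each page and the question into vectors, then return the page that is closest to the question.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Word-count vector; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(pages: dict[int, str], question: str) -> int:
    """Return the page number whose text is most similar to the question."""
    q = vectorize(question)
    return max(pages, key=lambda number: cosine(vectorize(pages[number]), q))


# Invented page snippets, for illustration only.
pages = {
    126: "Lazio overview: getting there, where to stay, and regional food.",
    127: "Day trips from Rome by train and bus.",
    128: "Photo opportunity: the best photo spot in Lazio at sunset.",
}

if __name__ == "__main__":
    question = "What's the best photo spot in Lazio?"
    print("Most relevant page:", retrieve(pages, question))
```

The retrieved page would then be handed to the model as context, so the answer is grounded in the uploaded file rather than in what the model happens to remember.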


Last but not least, there's also another tool that I wanted to highlight called Code Interpreter.


Code Interpreter is this ability to write Python code in the background to answer some very precise questions, usually around numbers and math and financial data.


Here, for instance, if I were to say in this conversation, we are sharing an Airbnb for four.


It's 1,200 euros.


What's my share plus my flight cost of 260?


By asking this question, this is not a typical thing that LLMs do great at by default, right?


But what's happening behind the scenes is that we're actually computing all of this, including currency conversion and so on, by writing code in the sandbox.


Once again, as a developer, I have nothing to do.


But because OpenAI is managing this, it does not mean it's a black box.


In fact, if I go here and refresh the threads, we should see that these are the exact threads that we've been feeding.


You can see we're going to Rome.


Like all of the messages, we see the function calls that I highlighted to annotate the map.


Here, this is the Python code that was written behind the scenes to actually answer the question, compute the currency conversion, divide by the number of people, and so on.
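
The generated code isn't shown in the transcript, but the calculation it describes is small enough to sketch. The version below assumes the 1,200 and the 260 are both in euros and uses a made-up exchange rate for the conversion step; the function names are hypothetical.

```python
def my_total(rental_total: float, people: int, flight_cost: float) -> float:
    """One person's share of the rental plus their own flight."""
    return rental_total / people + flight_cost


def convert(amount: float, rate: float) -> float:
    """Convert an amount using an exchange rate supplied by the caller."""
    return round(amount * rate, 2)


if __name__ == "__main__":
    total_eur = my_total(rental_total=1200, people=4, flight_cost=260)
    print(f"Total in euros: {total_eur:.2f}")  # 1200 / 4 = 300, plus 260

    hypothetical_rate = 1.10  # made-up EUR to USD rate, for illustration only
    print(f"Total in dollars: {convert(total_eur, hypothetical_rate):.2f}")
```

Running exact arithmetic like this in a sandbox, instead of having the model predict the digits, is what makes the answer reliable.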


Really, the Assistants API is a complete toolkit with conversation history, with access to retrieval and files.


You can upload now up to 10,000 files in retrieval, and even Code Interpreter and function calling.


All of this we can build on from day one.


Let me know what you think about future models.


I mean, one of the things that is a little bit confusing is the fact that they do have GPT-6 and other names trademarked.


I'm wondering if they're just going to continue with the traditional methods.


But it is quite hard to predict, considering the fact that OpenAI is a company that comes with a lot of drama and, of course, a lot of surprise, and with the rate that AI is exponentially increasing in terms of the capabilities and everything new being discovered, what feels like every week.


I mean, trying to predict the capabilities a year, two years, or three years from now is quite hard.


But I think from this, we do know that in November, there's probably going to be a new model released.


Whether it is GPT-Next or GPT-5, I can say one thing is certain: it's going to be a monumental leap in terms of the capabilities and usability of what we're about to see.

