OpenAI’s o3 and o4-mini Make Multimodal Reasoning Breakthroughs
How the company's newest models continue an industry trend toward autonomous AI
Welcome to another edition of AI 101, where every Wednesday we bring you the biggest AI update of the week.
This Week’s Update: OpenAI Releases o3 and o4-mini
On April 16th, OpenAI released o3 and o4-mini, their most advanced models to date. They are the latest additions to the o-series, a group of reasoning models that spend time “thinking” before responding to users. For the first time, the models have access to the full suite of ChatGPT tools, including web search, file analysis, and image generation, when responding to user queries. They were trained through reinforcement learning to understand when and how to use these tools independently.
o3 and o4-mini are also OpenAI's first models to integrate images into their chain-of-thought reasoning. They can analyze images more effectively by cropping, rotating, or zooming in on them. The new models performed significantly better than their predecessors on key multimodal and visual benchmarks.
o3 and o4-mini are available today for paid users, and free users can try o4-mini by selecting “Think” before submitting a query.
Why This Is Important
OpenAI focused on the reasoning and agentic capabilities of o3 in their announcement, stating that the new model is a “step towards agentic ChatGPT that can independently execute tasks on your behalf.” The statement is representative of a wider trend in the industry: many companies are shifting their focus to building AI agents that operate with minimal user input. So far, agents have struggled to complete tasks independently. However, OpenAI’s newest models tested well on instruction-following and agentic benchmarks and could prove to be the next step toward autonomous AI.
Quick Hits:
Nvidia announced that the United States is restricting the sale of its H20 semiconductors to China. Soon after, the Chinese technology company Huawei announced plans to begin mass shipments of its advanced 910C chips to Chinese customers.
Sources report that OpenAI is building an “X-like” social network focused on ChatGPT’s image generation abilities.