OpenAI has just launched GPT-4o Mini, its latest AI model, designed to be smaller, cheaper, and faster than previous iterations. Starting today, the model is available to developers through the API and to consumers in the ChatGPT web and mobile apps, with enterprise users set to gain access next week. GPT-4o Mini promises to outperform existing small AI models on various reasoning tasks involving both text and vision. As small AI models rise in popularity, developers are increasingly opting for them over larger models like GPT-4 Omni or Claude 3.5 Sonnet because of their speed and cost efficiency. These compact models are well suited to the high-volume, simple tasks that developers frequently rely on AI to perform.
Replacing GPT-3.5 Turbo as OpenAI’s smallest offering, GPT-4o Mini boasts impressive benchmark scores. It achieves 82% on the MMLU benchmark, which measures reasoning, surpassing Gemini 1.5 Flash’s 79% and Claude 3 Haiku’s 75%. In terms of math reasoning, as measured by MGSM, GPT-4o Mini scores an impressive 87%, outperforming Flash’s 78% and Haiku’s 72%.
OpenAI emphasizes the affordability of GPT-4o Mini, which is over 60% cheaper to run than its predecessor, GPT-3.5 Turbo. Currently, the model supports text and vision in the API, with plans to add video and audio capabilities in the future. Olivier Godement, OpenAI’s head of product for the API, commented, “For every corner of the world to be empowered by AI, we need to make the models much more affordable. I think GPT-4o Mini is a really big step forward in that direction.”
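Since the model accepts text and images in the same request, a single chat message can mix both content types. The sketch below only builds the request payload in the documented chat format and does not send anything; the image URL is a placeholder.

```python
# Hedged sketch: the shape of a text + vision request body in OpenAI's
# chat format. This only constructs the payload; no API call is made,
# and the image URL is an illustrative placeholder.
payload = {
    "model": "gpt-4o-mini",
    "messages": [
        {
            "role": "user",
            # A message's content can be a list of typed parts,
            # mixing text and image references in one turn.
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
}
print(payload["model"])  # gpt-4o-mini
```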
For developers using OpenAI’s API, GPT-4o Mini is priced at 15 cents per million input tokens and 60 cents per million output tokens. The model features a context window of 128,000 tokens—equivalent to the length of a book—and has a knowledge cutoff of October 2023. While OpenAI has not disclosed the exact size of GPT-4o Mini, it is said to be comparable to other small AI models like Llama 3 8B, Claude 3 Haiku, and Gemini 1.5 Flash. However, GPT-4o Mini claims superiority in speed, cost efficiency, and intelligence based on pre-launch tests in the LMSYS.org Chatbot Arena. Early independent tests support these claims, with George Cameron, co-founder of Artificial Analysis, noting, “Relative to comparable models, GPT-4o Mini is very fast, with a median output speed of 202 tokens per second. This is more than 2X faster than GPT-4o and GPT-3.5 Turbo and represents a compelling offering for speed-dependent use-cases including many consumer applications and agentic approaches to using LLMs.”
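The per-token prices translate directly into a per-request cost. A minimal sketch, using only the published rates (15 cents per 1M input tokens, 60 cents per 1M output tokens):

```python
# Estimating a GPT-4o Mini API bill from the published per-token prices.
INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one API call."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: 10,000 input tokens and 2,000 output tokens cost
# 10,000 * 0.15/1M + 2,000 * 0.60/1M = $0.0015 + $0.0012 = $0.0027.
print(f"${estimate_cost(10_000, 2_000):.4f}")
```

At these rates, even a million such requests would cost only a few thousand dollars, which is the economics behind the "high-volume, simple tasks" positioning.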
Alongside the GPT-4o Mini launch, OpenAI has introduced new tools for enterprise customers. The Enterprise Compliance API is designed to help businesses in highly regulated industries—such as finance, healthcare, legal services, and government—comply with logging and audit requirements. These tools will enable admins to audit and manage their ChatGPT Enterprise data, providing records of time-stamped interactions, including conversations, uploaded files, workspace users, and more. Additionally, OpenAI is offering admins more granular control for workspace GPTs, which are custom versions of ChatGPT created for specific business use cases. Previously, admins could only fully allow or block GPT actions created in their workspace, but now they can create an approved list of domains that GPTs can interact with.
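OpenAI has not published the mechanics of the approved-domain feature, but the underlying idea, checking each outbound GPT action against an admin-maintained domain allow-list, can be sketched as follows. All names and domains here are illustrative, not OpenAI's actual implementation.

```python
# Hedged sketch of a domain allow-list for outbound actions: a request is
# permitted only if its host matches an approved domain or a subdomain of
# one. Domains and function names are hypothetical.
from urllib.parse import urlparse

APPROVED_DOMAINS = {"api.example-crm.com", "internal.example.com"}  # hypothetical

def is_allowed(url: str, approved: set = APPROVED_DOMAINS) -> bool:
    """Return True if the URL's host is an approved domain or subdomain."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in approved)

print(is_allowed("https://api.example-crm.com/v1/contacts"))  # True
print(is_allowed("https://evil.example.net/steal"))           # False
```

Matching on the parsed hostname rather than the raw URL string avoids trivial bypasses such as embedding an approved domain in the path or query string.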
OpenAI’s release of GPT-4o Mini marks a significant advancement in the field of small AI models, offering an affordable and efficient solution for developers and enterprises alike. With its impressive performance metrics and cost efficiency, GPT-4o Mini is set to become a popular choice for high-volume, simple AI tasks. Moreover, the new enterprise tools provide enhanced compliance and control features, making AI integration more secure and manageable for businesses in regulated industries.