Wednesday, November 26, 2025

Google’s New Gemini 3 AI Crushed OpenAI and Anthropic in a Benchmark Test for Business Operations

Google has released Gemini 3, the latest in its line of advanced AI models. As most AI companies do when announcing a new flagship model, Google boasted that Gemini 3 is its most intelligent model yet and that it tops several benchmarks, including one that judges an AI’s ability to run a business. Google has also released a new application to supplement Gemini 3’s coding power.

After months of teasing, Google CEO Sundar Pichai finally announced Gemini 3 in a blog post, saying that it enables anyone to “bring any idea to life.” The model is now integrated throughout much of Google’s ecosystem, including its search engine’s AI Mode, Google AI Studio, and the Gemini app. Pichai said that Gemini 3 is “much better at figuring out the context and intent behind your request, so you get what you need with less prompting.”

Gemini 3 will be a family of models that vary in size and price. For now, the only model available is Gemini 3 Pro, which is the largest and most expensive version. Over time, smaller and cheaper versions of the model will be released. Gemini 3 Pro also includes a “Deep Think” mode, a type of feature that has become standard across AI platforms. With this mode activated, Gemini can think even longer and harder about how to solve complex problems.

Demis Hassabis, CEO of Google DeepMind, wrote that Gemini 3 is “the best model in the world for multimodal understanding and our most powerful agentic and vibe coding model yet, delivering richer visualizations and deeper interactivity.” By multimodal, he’s referring to the capability of AI models to process and generate content across a variety of mediums, including text, images, and video. Vibe coding refers to the practice of directing AI agents to write and execute code on your behalf, and it has been a major AI topic in 2025.

In its blog post, Google also claimed that Gemini 3 Pro is significantly less sycophantic than other AI models.
“Its responses are smart, concise and direct, trading cliché and flattery for genuine insight,” the company wrote, “telling you what you need to hear, not just what you want to hear.”

According to Google’s own testing, Gemini 3 Pro tops several widely used AI benchmarks, including MMMU, which gauges multimodal understanding, and Terminal-Bench, which judges a model’s ability to code within a computer terminal.

One notable leaderboard that Gemini 3 Pro topped was Vending-Bench 2, a benchmark that measures an AI model’s ability to run a business (in this case a vending machine) over a long period of time. After a full simulated year of operation, Gemini 3’s bank account balance was $5,478.16, much higher than that of second-place finisher Claude Sonnet 4.5, which ended the virtual year with $3,838.74.

Google clearly has high hopes for Gemini 3 in the coding domain. Along with the new model, the company has released Google Antigravity, a new agentic development platform that will likely compete with fast-growing startup Cursor, which sells its own AI-powered integrated development environment (IDE). Google Antigravity gives AI agents access to a code editor, terminal, and browser. In addition to Gemini 3, Google Antigravity users will also be able to select Anthropic’s Claude models and OpenAI’s open-weights model. Google says that Antigravity also comes “tightly coupled” with Nano Banana, the company’s popular image-editing model.

For nontechnical founders who might be intimidated by the technical details of Antigravity but want to try their hand at AI coding, Google has brought Gemini 3 Pro to Google AI Studio, a web-based application designed specifically for those without coding experience. In a blog post, Google AI Studio product lead Logan Kilpatrick wrote that Gemini 3 Pro “can translate a high-level idea into a fully interactive app with a single prompt. It handles the heavy lifting of multi-step planning and coding details delivering richer visuals and deeper interactivity, allowing you to focus on the creative vision.”

Gemini 3 Pro is currently available for enterprise use through Google’s Gemini Enterprise platform. Google says that several businesses are already using Gemini 3 Pro, including Box, Cursor, Harvey, Replit, Thomson Reuters, and Shopify. Gemini 3 Pro costs $2 per million input tokens for prompts smaller than 200,000 tokens, and $12 per million tokens generated. Tokens are units of data that are processed and generated by AI models.

BY BEN SHERRY @BENLUCASSHERRY
