Friday, October 3, 2025
This Company Says 1 New AI Feature Can Handle 20 Hours of Work in Seconds
Popular wedding planning platform The Knot has released an update to its mobile app that uses AI to streamline the process of finding local vendors. In a press release, the company said that the new update “cuts over 20 hours of planning work to just seconds.”
The reimagined “planning experience,” as The Knot calls it, allows couples to browse through thousands of photos of weddings to create a vision board. By clicking an icon, users can activate a new feature called “make it yours,” which scans the image and then searches through The Knot’s database of venues and vendors to find similar options that “fit your vibe, budget, and location.”
Christine Brown, The Knot’s VP of product, says that the company built this new AI feature entirely in-house, rather than relying on AI models from external providers like OpenAI or Anthropic. To create the feature, Brown says the company trained its own models on “more than a million images accessible on The Knot.” To test its effectiveness, The Knot ran a two-month pilot in which thousands of couples were given early access to the tool.
As an example of how the new feature can help amateur wedding planners save time, Brown pointed to one of the most time-consuming aspects of throwing a wedding: picking a venue. Brown’s team estimated that most couples take roughly six weeks to pick a venue, devoting 3.5 hours per week to the search. That adds up to 21 hours of total searching time, which Brown says can now be reduced to minutes thanks to the new tool.
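The arithmetic behind Brown’s estimate is straightforward; a minimal sketch, using only the figures cited above:

```python
# Brown's venue-search estimate: six weeks of searching
# at 3.5 hours per week.
weeks = 6
hours_per_week = 3.5
total_hours = weeks * hours_per_week
print(total_hours)  # 21.0 — the "over 20 hours" the company cites
```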
The Knot says that this update is just the first step in a larger push to introduce AI-powered wedding planning features. As for what’s next, Brown says the company is building AI tools to help both couples and professional wedding planners and vendors. One of those tools is an AI-assisted email reply feature that allows vendors to convert more leads into bookings.
“We see AI as a powerful force to support the planning journey,” Brown says, “helping couples and vendors save time, while still keeping personalization and human touch at the heart of the wedding experience.”
BY BEN SHERRY @BENLUCASSHERRY
Wednesday, October 1, 2025
Microsoft Is Adding Anthropic’s Claude to Its AI Tools. Here’s What It Can Do for Businesses
Microsoft is expanding the lineup of AI models used to power 365 Copilot, its workplace-focused AI service. The move is a sign that Microsoft is actively working to lessen its reliance on OpenAI’s models after investing over $10 billion in the company.
In its blog post announcing the news, Microsoft said that while 365 Copilot will continue to be primarily powered by OpenAI’s models, users will now be able to harness Anthropic’s models in two specific ways. One is in Researcher, a 365 Copilot feature that searches the internet and analyzes internal data like emails, Teams chats, and files, in order to conduct deep research.
Normally, Researcher runs on models developed by OpenAI, but 365 Copilot customers will now have the option of using Anthropic’s Claude Opus 4.1 model (the company’s most advanced model currently available) instead. Microsoft said that Opus 4.1 in Researcher could be used to accomplish tasks like “building a detailed go-to-market strategy, analyzing emerging product trends, or creating a comprehensive quarterly report.”
The other method for using Claude in 365 Copilot is within Copilot Studio, a feature that enables users to build customized AI agents that can automate workflows. Users will now be able to easily select Claude Opus 4.1 or Claude Sonnet 4 (Anthropic’s mid-sized model) when creating agents. Microsoft says users will even be able to orchestrate whole teams of agents, all powered by different AI models, to work in tandem in order to accomplish tasks.
Workplaces with Microsoft 365 Copilot licenses can now use Claude in Researcher and Copilot Studio, but only if opted-in by an administrator. Microsoft wrote that “this is just the beginning,” and that users should stay tuned for Anthropic models to “bring even more powerful experiences to Microsoft 365 Copilot.”
Microsoft is also reportedly working on an AI marketplace for news and media publishers, according to Axios. The marketplace would enable publishers to sell their content to AI companies, who would in turn use that content to train their new AI models. Axios reported that Microsoft discussed plans for the marketplace at its invite-only Partner Summit in Monaco.
BY BEN SHERRY @BENLUCASSHERRY
Monday, September 29, 2025
Anthropic’s Claude AI Has 1 Killer Use Case, According to New Data
Software engineering is the overwhelming favorite use case for Claude, Anthropic’s AI model, according to a new report published by the company. The report, the third in a series tracking AI’s economic effects, also breaks down how enterprises are using Anthropic’s AI models. The takeaway? Enterprises are heavily focused on using Claude to automate tasks.
The report, titled “Uneven Geographic and Enterprise AI Adoption,” found that 36 percent of sampled conversations on Claude.ai, Anthropic’s ChatGPT-like platform for chatting with Claude, are centered on providing software development assistance. That makes it by far the AI model’s most popular use case. It should come as no surprise, then, that software developers working on applications are Claude’s heaviest users, making up 5.2 percent of all usage.
The other top Claude.ai uses, according to Anthropic’s data, include providing assistance with writing, acting as a virtual tutor, conducting research, and supplying financial guidance and investment assistance.
The report also tracked how enterprises are using Claude’s API, which enables developers to integrate Claude into their products and software applications. The data shows that businesses are largely using Claude to automate tasks, rather than using it as a learning tool or a collaborator.
Anthropic says this shouldn’t come as a surprise, because the API naturally lends itself to automation. “Businesses provide context,” the company explained, “Claude executes the task, and the output flows directly to end users or downstream systems.”
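The automation pattern Anthropic describes can be sketched in a few lines. This is a hypothetical illustration, not Anthropic’s code: `call_claude` stands in for a real API call (for example, via the official SDK’s `messages.create`) and is stubbed here so the flow is runnable end to end.

```python
# Sketch of the enterprise pattern: the business supplies context,
# the model executes the task, and the output flows straight to a
# downstream system with no human in the loop.

def call_claude(system_prompt: str, user_content: str) -> str:
    # Placeholder for a real Anthropic API call; returns a canned reply
    # so the pipeline can be demonstrated without network access.
    return f"[triaged] {user_content}"

def automate_ticket(ticket: str) -> dict:
    context = "You are a support triage assistant. Label each ticket."
    result = call_claude(context, ticket)       # Claude executes the task
    return {"ticket": ticket, "label": result}  # output flows downstream

print(automate_ticket("Login page returns a 500 error")["label"])
```

The point of the pattern is that the model sits in the middle of an existing workflow, which is why Anthropic says the API “naturally lends itself to automation.”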
As with Claude.ai, according to the report, software development is by far the most popular enterprise use of the Claude API, with computer and mathematical tasks accounting for just under half of all API traffic.
More specifically, 6.1 percent of all Claude API use is for resolving technical issues and workflow problems in software development; 6 percent is for debugging and developing front-end code and components for web applications; 5.2 percent for developing or managing professional business software; and 4.9 percent for troubleshooting and optimizing software.
In the report, Anthropic wrote that code generation tasks dominate API traffic “because they hit a sweet spot where model capabilities excel, deployment barriers are minimal, and employees can adopt the new technology quickly.”
But coding isn’t the only way that enterprises are using Claude. Ten percent of API usage comes in the form of office and administrative tasks, 7 percent is for science tasks, 4 percent is for sales and marketing tasks, and 3 percent is for business and financial operations.
The report also examined how the cost of using Claude to handle specific tasks correlates with usage amounts. According to the data, tasks typical of computer and mathematical jobs, like coding and data analysis, cost over 50 percent more than sales-related tasks, but still dominate overall use of the tech. This, according to the company, “suggests that cost plays an immaterial role in shaping patterns of enterprise AI deployment.”
Rather than focusing on costs, Anthropic postulated, “businesses likely prioritize use in domains where model capabilities are strong and where Claude-powered automation generates enough economic value in excess of the API cost.”
The report also revealed how each state in the U.S. typically uses Claude (specifically Claude.ai). Unsurprisingly, California (where Anthropic is based) is far and away the biggest Claude user, accounting for 25.3 percent of total use. Other states with heavy Claude usage include New York (9.3 percent), Texas (6.7 percent), and Virginia (4 percent).
BY BEN SHERRY @BENLUCASSHERRY
Friday, September 26, 2025
Gen-Z AI Founders Are Merging Work and Life in These 3 Ways
Young AI founders in San Francisco are upending preconceived notions about Gen Z’s approach to work-life balance. In a recent Wall Street Journal article, founders from ages 18 to 32 described a lifestyle entirely structured around their companies.
These founders and their grind-first mindsets stand in stark contrast to a 2024 Deloitte survey, which found that while 36 percent of Gen Z respondents consider work to be central to their identity, 25 percent consider work-life balance the top factor in choosing an employer.
Far from quiet quitting, these founders are working seven-day weeks, living in their offices, and eating only for sustenance. And they couldn’t be happier, at least according to the Journal’s reporting. Here’s what these young founders are doing to win in the AI era.
Living in the office
Several young founders interviewed by the Journal claimed to be working constantly. Marty Kausas, a 28-year-old founder building an AI startup called Pylon, said he had recently worked three 92-hour weeks in a row.
And Nico Laqua, a 25-year-old cofounder of AI-powered insurance startup Corgi, said that he lives in his office and typically spends “every waking hour” working (doggedly, perhaps) on his company. He claims to only hire people willing to work seven days a week. Indeed, Corgi is currently hiring a chef to provide the team with breakfast, lunch, and dinner seven days a week.
Blowing up the work-life balance
Even when they’re not physically in the office, these founders are reportedly almost always advancing their business interests in some way. Recent social activities for Kausas include attending a hackathon and taking a bike ride with a fellow founder.
Emily Yuan, a Corgi co-founder, told reporters that she and her founder friends spend their free time discussing funding rounds while exercising and going to saunas.
Deprioritizing food
Another common theme among the interviewed founders is their attitudes toward food and meals. Kausas told the Journal that he eats pre-packaged breakfasts and lunches from nutrition and supplements company Blueprint, because “the workday is more efficient if he doesn’t have to think about food.”
Haseab Ullah, founder of an AI customer support chatbot, also claimed to have a utilitarian approach to eating. Usually, his only meal of the day is an Uber Eats-delivered treat, a tactic that he said helps him “save time and avoid cooking.” (A young person using Uber Eats to avoid cooking may not be a shocker, but using it to source every meal sounds more extreme.)
Michelle Fang, an event planner at VC firm Headline, told the Journal that many founder-focused get-togethers in San Francisco don’t even serve alcohol, both because it is “out of fashion in the San Francisco crowd,” and because many founders “aren’t old enough to drink” yet.
BY BEN SHERRY @BENLUCASSHERRY
Thursday, September 25, 2025
OpenAI Introduces GPT-5-Codex, an AI Model Built Just for Coding
OpenAI has announced its newest model, GPT-5-Codex. The new model has been optimized for agentic coding in OpenAI’s suite of AI-powered software engineering tools, which is called Codex.
This year, AI programs that can write and edit software have emerged as the most lucrative use case for AI, propelling multiple companies to huge revenue increases. These tools are being used both by professional developers to make their work more efficient, and by casual vibe coders, who lack the technical skill to create websites and apps.
The Sam Altman-led company claims that because the new model was trained on real-world engineering tasks, it can outperform OpenAI’s default model. In a benchmark comparing the two models’ ability to refactor code (essentially reorganizing and cleaning up code), GPT-5-Codex scored nearly 20 percent higher than the default model, which is simply called GPT-5.
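For readers unfamiliar with the term, refactoring means restructuring code without changing what it does. A toy before-and-after, unrelated to the benchmark itself:

```python
# Before: two functions with duplicated multiplication logic.
def area_rect(width, height):
    return width * height

def area_square(side):
    return side * side

# After refactoring: the square case reuses the rectangle logic,
# so there is one place to fix or extend the math.
def area_square_refactored(side):
    return area_rect(side, side)

# Behavior is unchanged — that is the defining property of a refactor.
assert area_square(4) == area_square_refactored(4) == 16
```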
GPT-5-Codex is also said to be a strong independent worker. It can work autonomously on software for long stretches of time. According to a press release, OpenAI has seen the model “work independently for more than seven hours at a time on large, complex tasks, iterating on its implementation, fixing test failures, and ultimately delivering a successful implementation.”
The new model could also help alleviate one of the most notable pain points of vibe coding: bad code. Many software developers have remarked that much of their time working with AI-assisted code editors is spent cleaning up the AI’s code, which isn’t always as thoughtfully written as a human expert’s would be. But OpenAI says that GPT-5-Codex has been “trained specifically for conducting code reviews and finding critical flaws.”
In practice, the company says, this means GPT-5-Codex will review an entire codebase to identify flaws and autonomously test apps to find errors. OpenAI says that Codex currently handles “the vast majority” of proposed changes to code being written by OpenAI staffers, “catching hundreds of issues every day—often before a human review begins.” But even with its improved code review abilities, OpenAI still recommends using Codex as an additional reviewer; it says in a press release that it is “not a replacement for human reviews.”
Unlike the normal version of GPT-5, GPT-5-Codex won’t be immediately available via API, and OpenAI recommends only using the model for coding tasks in Codex-supported environments.
In addition, Codex is coming to mobile devices for the first time. Previously, to access Codex, you’d either need to use ChatGPT on a desktop computer or invoke Codex in an IDE (integrated development environment) like VSCode or Cursor. Now, Codex will be accessible in the ChatGPT iOS app, enabling easier coding on the go.
Codex and GPT-5-Codex are available across all of ChatGPT’s paid tiers, with $20-per-month ChatGPT Plus members getting enough access to “cover a few focused coding sessions each week.” Meanwhile, $200-per-month ChatGPT Pro members will get enough to “support a full workweek across multiple projects.”
Companies that pay for ChatGPT’s SMB-focused Business plan can purchase credits to give their developers more access to Codex, while larger companies with ChatGPT’s Enterprise plan get a shared credit pool.
In OpenAI’s press release, engineers and tech leads at companies including Cisco, Duolingo, Ramp, Vanta, and Virgin Atlantic praised Codex’s utility, but it remains to be seen if GPT-5-Codex can help OpenAI take market share away from Anthropic, whose similar Claude Code product has proved very popular with professional and casual software developers.
BY BEN SHERRY @BENLUCASSHERRY
Monday, September 22, 2025
‘I Feel Like a Better Manager’: Execs Share How AI Transforms How They Lead
It hasn’t taken long for business leaders to discover that AI can help them manage people, and they are using it in ways that executives likely couldn’t have dreamed of several years ago—from reimagining how an org chart works to using AI to help them write a tricky email.
We spoke to five CEOs—and one chief human resources officer—to learn how they are harnessing AI to help get the most out of their people. They are:
Arvind Jain, CEO, Glean, an AI enterprise search platform that has 1,000 employees and was most recently valued at $7.2 billion, according to PitchBook
Stacy Spikes, CEO, MoviePass, a subscription-based movie ticketing service, which recently announced a $100 million capital investment
Aakash Shah, CEO, Wyndly, a startup focusing on personalized and modern allergy treatments, which ranked No. 333 on the Inc. 5000 this year
Renata Black, CEO, EBY, a membership-based women’s intimate apparel company that has raised more than $18 million, according to PitchBook
Ashley Kirkwood, CEO, Speak Your Way to Cash, a sales and speaking training organization
Ali Bebo, chief human resources officer, Pearson, a U.K.-based education and testing services company
Throughout this process, these leaders are finding that AI is helping their employees reach beyond what they thought they were capable of—setting benchmarks, preparing for one-on-ones, and improving reports before they reach their managers.
And it’s giving CEOs a lot to consider when it comes to how they run their workplace. The technology, says Spikes, is “going to create more than it will take away.”
1. Supercharge the org chart
Jain, who started Glean in 2019, sees himself as a facilitator. That means he has to make sure every task gets assigned to the right person or group—and for him, an old-fashioned org chart just isn’t good enough.
“It gets obsolete quickly, because the world is changing so fast,” he says. That’s why his company, which makes AI tools designed to help businesses find answers and automate workflows, has created a kind of living org chart with AI.
Jain says that when he has an idea, he doesn’t have time to waste working out which person or team has capacity to take it on. Instead, he wants to get straight to the person or people who can best work on it with him.
Glean’s AI examines employees’ work and contributions in real time, mapping core competencies in a way a traditional org chart can’t. The AI is “constantly observing on any given subject matter who are the top voices, who are the ones who are answering the most questions in Slack or in Teams, who are the ones who are writing authoritative documents on that,” Jain says. He adds that he uses this tool every day to help Glean move fast on new ideas and keep projects on track.
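The core idea Jain describes—ranking people by observed contributions rather than by a static chart—can be illustrated with a toy sketch. This is a hypothetical simplification, not Glean’s system, which draws on far richer signals across documents and chat:

```python
# Toy expertise map: rank people by how often they answer
# questions on a topic, e.g. from mined Slack/Teams threads.
from collections import Counter

answers = [  # (person, topic) events
    ("ana", "billing"), ("ben", "billing"), ("ana", "billing"),
    ("cho", "search"), ("ana", "search"),
]

def top_voices(topic, events, n=2):
    """Return up to n people who answer most often on a topic."""
    counts = Counter(person for person, t in events if t == topic)
    return [person for person, _ in counts.most_common(n)]

print(top_voices("billing", answers))  # ['ana', 'ben']
```

A static org chart encodes who reports to whom; a contribution-ranked map like this encodes who actually knows what, which is the distinction Jain is drawing.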
Pearson’s Bebo says that employees use an in-house AI agent called CARA that can answer questions about their role and ways that they can excel or get promoted: “She is what I would describe as our people’s friend as they think about navigating their career here,” Bebo says.
CARA is designed to act as an enabler, helping both employees and managers be more effective and understand where they are in relation to their job expectations and goals. “We don’t want to have AI replace managers, but we really want to think about how it helps our managers even perform better,” she says.
One way Spikes sees AI transforming his workforce at MoviePass is by creating more opportunities for the people he has—and for tomorrow’s hires. “I’m finding that it is overall increasing how you’re going to use people, not decreasing how you’re going to use them,” he says. “I think that’s the beauty of this emerging technology.”
2. Transform meetings from status updates into deep conversations
Shah, Wyndly co-founder and CEO, uses AI to better prepare for his one-on-ones, especially with his executive team and co-founder, Manan Shah, his cousin. He sees it as akin to the culinary technique of “mise en place,” where chefs prep everything they need before they turn on the heat.
“If we can get everything prepped before we’re ready to jump into the work, it makes the work both more fulfilling but also just more effective,” he says, adding that it also creates room for more interpersonal connections between him and his direct reports.
“I think that’s what the difference maker is between a good and a bad leader, at least for me, is whenever I’ve been able to spend more time on the interpersonal stuff, I found that I feel like a better manager,” Shah says.
At intimate apparel retailer EBY, CEO Black says the entire company is mandated to use AI to help optimize reports and analysis before presenting anything to her. Black makes them show her their original plan and how they optimized it using AI.
As a result, her people have more clarity into what they are doing and how to achieve their full potential, she says.
“AI allows them to present information in a much clearer way that allows them to be more confident in what they’re presenting,” she says. In turn, she is able to give them better feedback.
3. Power up performance reviews and employee evaluations
Bebo, who joined Pearson in 2021 to assist in its culture and business transformation, says that AI agents are embedded in the performance reviews at the company. But while managers are still doing the employee evaluations, the AI can help both employees and leaders craft sharper and more articulate reviews and self-assessments so that every single one “sounds like Pearson.”
The agents aren’t mandatory, but for managers who do use them, Bebo reports that they have sped up the performance review process and helped them deliver meaningful feedback to their employees.
Glean takes this approach one step further in its performance reviews, says CEO Jain. While managers and employees use AI prompts to help write their assessments, as at Pearson, Glean also uses AI to collect and analyze each employee’s contribution to the company, enabling managers to have a complete, clear, and—crucially—objective record of everything the employee did during the review period. That combats biases and favoritism, Jain says, but it also means he doesn’t forget or overlook any of his employees’ achievements or sticking points.
“The conversation shifts from getting on the same page to, actually, we are already on the same page, and this is now a time to solve problems that you run into so that you can become better, you can grow as an employee,” he says.
4. Use AI prompts to get the best responses from your people
Spikes, who co-founded MoviePass in 2011, left in 2018, and then returned to save the company in 2021, says he started using AI prompts with his teams to challenge them to think differently about how they are tackling business challenges or new projects. “That curiosity helps speed up the team,” he says, mentioning that some projects that used to take weeks now take as little as a couple of hours.
What that gives him as a manager is not just a faster outcome, but also more opportunities for iterations and feedback, leading to a better outcome. “You get much more of a response loop that you just didn’t have before,” Spikes says.
Jain also uses AI prompts with his Glean executive team, asking them to set business goals each week. Then he uses a custom-built AI that helps him track progress and gather insights on each of those goals. He says that gives him a “deep understanding” of precisely where there was forward momentum and where there were slowdowns or blocks.
And Wyndly CEO Shah says his business is moving to a similar model. When people do their daily check-in, they are prompted to think about how what they are doing is aligning with the business’s goals and to preempt what questions Shah might have for them based on what they report. That way, he says, “everyone’s speaking the same language.”
5. Let AI be a thought partner
Knowing what to say and how to say it is crucial to getting a CEO’s message and vision across to their employees, and AI can act like the ultimate comms specialist and thought partner to do just that.
“Anytime I have to write a very complicated email, I just press play. I tell it exactly what I want to say, and then I say, polish this up, and then make it super short and punchy. And it gives me a really strong response,” EBY’s Black says.
Speak Your Way to Cash CEO Kirkwood, who published a book with the same name as her company in 2021, agrees that AI can help take the edge off otherwise potentially tense interactions with staff. “If I have to have a difficult conversation, it’s helpful for me to have a script,” she says. “That way I can have it quickly, succinctly, get in and out, and not open up any legal liabilities.”
AI can also help temper hard-to-hear feedback so that your employees get the message without getting over-anxious, Black says, adding that because she has a very direct style of communication, AI can help soften her tone without losing impact. “That’s like the AI coaching me on my leadership skills,” she says.
Wyndly’s Shah puts it another way: When he wants to send out a company-wide message at Wyndly, AI is a strategy for getting over “blank-page syndrome.”
And at Pearson, some of the company’s executives have created digital twins that act as “thought partners,” helping them role-play different conversations and strengthen their arguments, Bebo says.
“Think of AI as your friend and a partner,” she says. “It doesn’t replace your owning and delivering and making sure you’re sending the right message. It’s just sharpening the conversation.”
BY CLAIRE CAMERON, FREELANCE WRITER
Saturday, September 20, 2025
Elon Musk’s xAI and the Strategic Acquisition of X (Formerly Twitter)
Elon Musk Just Pulled Off His Most Strategic Move Yet—And No One Saw It Coming.
His AI company, xAI, just acquired X (formerly Twitter) in a massive $33 billion deal.
On the surface, it looks like just another corporate shuffle. But in reality, Musk may have just outmaneuvered the entire AI industry.
Here’s why this changes everything:
• X is valued at $33 billion ($45 billion, less $12 billion in debt)
• xAI is now worth a staggering $80 billion
• The deal is an all-stock transaction
At first glance, it seems like Musk took a loss—after all, he originally paid $44 billion for Twitter. But this move isn’t about social media.
It’s about something far more valuable: data.
The Real Reason Musk Bought Twitter
Back in 2022, people were confused. Why would the world’s richest man, known for building rockets and electric cars, want a struggling social media platform?
Now, the answer is clear: Twitter (now X) was never just a social media company—it was a massive, real-time data engine.
With 600 million active users generating a constant stream of conversations, opinions, and real-world events, X is a goldmine for training AI models.
And that’s exactly what xAI needs to take on OpenAI, Anthropic, and Google.
The Timing Is No Coincidence
Just a few months ago, xAI secured a $6 billion funding round at a $24 billion valuation. Now, after this acquisition, its valuation has skyrocketed to $80 billion—outpacing even OpenAI’s growth.
Why does this matter? Most AI companies struggle to get high-quality, real-world data. Their models rely on stale, pre-existing datasets that don’t reflect real-time human behavior.
But xAI now has something its competitors don’t: a live firehose of human interaction.
This means:
✅ More human-like AI models
✅ A competitive edge in real-time applications
✅ The ability to train AI on the most up-to-date information available anywhere
What Happens Next?
This merger isn’t just about an AI assistant inside X. It’s the foundation for something much bigger.
1️⃣ AI-Driven Content & Conversations
Expect smarter content recommendations that understand not just what you like, but why you like it. AI-generated insights, real-time fact-checking, and even automated dispute resolution could change how people engage online.
2️⃣ X Becomes More Than Social Media
This could push X toward becoming a full-fledged “everything app”—integrating AI-powered tools for content creation, virtual assistants, and even education.
3️⃣ Regulatory Strategy at Play
By structuring the deal as xAI acquiring X (instead of the other way around), Musk positions this as an AI-driven initiative rather than a social media consolidation—potentially avoiding regulatory roadblocks.
The Bottom Line
This isn’t just another tech merger. It’s a calculated move that positions xAI as a major player in AI, while using X’s data to supercharge its models.
Musk isn’t just competing with OpenAI, Google, and Anthropic. He’s changing the game entirely.
Monday, September 15, 2025
Anthropic Says This AI Tool Can Now Create and Edit Documents
Anthropic’s Claude AI has been updated with the ability to create and edit files, including PDFs, Excel spreadsheets, Word documents, Google Docs, and more.
Anthropic announced the update on its blog, explaining that the new features live on its consumer-facing platform, Claude.ai. Until now, the platform could analyze files, but couldn’t create or manipulate them. (Claude.ai is basically Anthropic’s version of ChatGPT.)
In a video detailing how the new feature works, a user asks Claude to help them analyze revenue data for their small food truck fleet and package the findings in a Google Doc. After the user uploads a few CSV files containing the data, Claude performs its analysis, creates a series of data visualizations, and puts it all together in a handy DOCX file that can either be downloaded or opened directly in Google Drive.
“Whether you need a customer segmentation analysis, sales forecasting, or budget tracking,” Anthropic wrote in its blog, “Claude handles the technical work and produces the files you need.”
To create files, Claude uses what Anthropic refers to as a “private computer environment,” in which the AI model can write code and run programs. This is similar to ChatGPT’s recently announced agent mode, which gives the AI platform access to a virtual browser that it can use to navigate the internet. These features, which involve giving an AI model access to additional tools, are referred to as agentic capabilities.
The company advises starting “with straightforward tasks like data cleaning or simple reports,” and then working up to “complex projects like financial models once you’re comfortable with how Claude handles files.”
Currently, when users ask Claude to create a document or spreadsheet, the model opens a window called an Artifact, which is essentially an interactive block of content. Prior to the release of these new features, if you asked for a document, Claude would create a document Artifact; if you asked for a spreadsheet, it would create a spreadsheet Artifact. Now, instead of keeping those Artifacts contained within chats, users can download and use their AI-created files.
Anthropic says that file creation is currently available for workplace-based Claude Team and Enterprise users, and for Claude Max subscribers, who pay $200 per month. Claude Pro users, who pay $20 per month, will get access to the feature “in the coming weeks.”
BY BEN SHERRY @BENLUCASSHERRY
Friday, September 12, 2025
The Best AI Success Stories Are Sitting on Hard Drives and Have 1 User
I had coffee with my favorite CTO yesterday and he told me about his new AI app. It’s basically a CTO-in-a-box.
And it’s awesome.
And he’s the only one using it.
And it’s going to stay that way.
Despite my trying to persuade him otherwise.
One of the reasons there’s so little proof of the value of AI is that the best, most useful, most ingenious apps actually never leave the creator’s hard drive. In fact, once my friend pointed out what he was doing, I myself realized that most of what I’ve created with AI is available only to me on my hard drive, and moreover, that’s definitely where my best stuff is.
In fact, it seems like most of the better “AI apps” aren’t even primarily AI, but AI being implemented, like my CTO friend implemented it, to unlock automation and unstructured data — and ultimately narrative output — in a way that couldn’t be done before.
So why is this happening?
The Genius of CTO-in-a-Box
I’m probably overhyping this because he’s my buddy and he kindly listens to a lot of my BS before it gets to you folks, but my CTO friend’s CTO-in-a-box isn’t anything to eff with.
He and I worked shoulder-to-shoulder for years, and together we developed some amazing little features, a few apps, and the tech backbone of a multimillion-dollar business. I say “we” but all I did was dream stuff up with him, vet it, and MVP it out, after which he and his brilliant team coded it. And they got it right the first time every time, and he usually added his own flair to surprise me with some technical trick no one would ever notice but made what we were doing 10 times better under the hood.
He left that company not long after I did, and despite my trying to wrangle him into what I was doing, he took another job, to come in and do a technical turnaround on a private equity-purchased startup that had tons of potential but was stagnating.
He hadn’t done anything like a turnaround before and I had just finished one. We have coffee every two weeks and so our conversations turned to the science of the turnaround. Then he disappeared for a month, and when we got back together, yesterday, he shocked the hell out of me.
“Basically, what I did was take every bit of data, company data, sales data, all the code, all the documentation — they had a lot of ‘stuff’ [his air quotes] just sitting in directories and databases,” he told me. “I slammed it all into a vector database, wrote some code, integrated Claude Code to build some agents and totally write the front end, and now the LLM is like my personal assistant.”
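For readers who want the shape of what he described, here’s a minimal sketch of the pattern: chunk your company “stuff,” turn each chunk into a vector, and retrieve the closest chunk when a question comes in. This is a toy stand-in, not his actual system: a real build would use a learned embedding model, a production vector database, and an LLM to draft the answer. The bag-of-words vectors and the `TinyVectorStore` class here are illustrative assumptions, kept dependency-free just to show the retrieval loop.

```python
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Stand-in "embedding": a word-count vector (real systems use learned embeddings).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class TinyVectorStore:
    """Toy vector store: keeps (chunk, vector) pairs, returns the closest chunk."""
    def __init__(self):
        self.docs = []

    def add(self, chunk: str) -> None:
        self.docs.append((chunk, embed(chunk)))

    def query(self, question: str) -> str:
        qv = embed(question)
        return max(self.docs, key=lambda doc: cosine(qv, doc[1]))[0]

store = TinyVectorStore()
store.add("deploy notes: the nightly ETL job doubled memory use after the schema change")
store.add("sales data: Q3 bookings grew 12 percent, led by the enterprise tier")
print(store.query("why did memory use double after the schema change"))  # retrieves the ETL chunk
```

His real pipeline swaps each toy piece for production parts, but the loop is the same shape: everything goes in once, and every question is answered from whatever chunk sits closest to it.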
He’s underselling it. I know this because of the example he gave me.
Builders Gonna Build
“We had a sudden spike in resources, so I asked it what was going on, and it brought me to the right section of code that was the problem and hypothesized why, and I fixed it in 30 seconds,” he said.
And then he made me jealous.
“Oh, it also does all my weekly status reports and my standup agenda and all the reporting I have to do for the ELT and the board,” he continued. “I don’t let it send emails, but it’ll create the draft for me to review with the summary and a link to the report.”
“Tell me you built it so anyone can use it,” I said.
“Of course,” he responded. “I mean, not for all the outliers, but yeah you could start over and import new data, it knows what it’s getting and what to do with it.”
“Tell me it’s self-perpetuating with new data it creates on its own,” I said, “like those email summaries and reports.”
He just smiled.
“Dude,” I said and threw my hands up. “It’s a CTO-in-a-box. Let me at it.”
“No,” he laughed. “It’s staying on my hard drive.”
“But you built it like a product.”
“Because that’s how I roll.”
Then he took a smug sip of his mocha whatever and I couldn’t even be mad at him.
Don’t Be So Quick to Write Off AI
I say this as the guy who can’t stop writing off AI.
Nah, I’ve been disparaging how we’ve been selling AI for years now, having been building it since 2010, and, in a nascent sense, as far back as 2000. But each time I’ve firebombed today’s AI hype in public, especially generative AI — because that’s the “AI” everyone is familiar with and what 95 percent of people are talking about when they say “AI” — I’ve prefaced my flaming with how amazing the technology actually can be when you know what you’re doing.
In the hands of my CTO friend, amazing doesn’t even begin to describe what you can do.
For the record, he’s on the uppermost subscription level of at least five different providers, a four-figure-a-month bill footed by his private equity overlords. And he’s aware that he will be squeezed soon.
In fact, he said openly, “I got on the gravy train while the platforms are loss-leading.”
They’ll price him out, and that’s another reason not to build a public product around it. He doesn’t know the true economics.
Do What the CTOs Are Doing
Of course, I asked my CTO friend to send me his documentation, because of course he documented it, and I’m building something around content and creators that could use its own CTO-in-a-box. And that got me thinking. Right now, all the coding I’ve done with the AI and the agents and such, it’s all sitting on my hard drive, and like my friend, I’ve built it like a product but I’m the only user in the credentials table.
But unlike my friend, I built it like a product because I am indeed thinking of packaging it and selling it as a product down the road. If I could just stop writing for a while and get my brain on it for more than five minutes.
Which, in today’s world, actually gets a lot of Claude coding done. It’s the peer review that takes time, if you get me.
If I’ve got advice, it’s this. If you want to build something with AI, find the people who are doing amazing things on their hard drive — facing real challenges, solving real problems, and not just leveraging AI to jump on the gravy train.
Buy them a mocha whatever and ask them what they’re doing and how they’re doing it. Because the more my CTO friend spoke, the more my vision was clouded by dollar signs. The problem is that for every story like his I hear 100 more stories about chatbot wrappers and unstructured data parsers being sold like they’re magic.
Those aren’t being funded anymore, finally. That opens the door for people to wring real value and usage out of this AI nonsense.
If you’re a fan of real value and usage, jump on my email list. I try to talk about that as much as possible, whether that’s AI or tech or something else.
EXPERT OPINION BY JOE PROCOPIO, FOUNDER, JOEPROCOPIO.COM @JPROCO
Wednesday, September 10, 2025
Mark Cuban Has 2 Words for People Who Don’t Want to Learn AI
Skims founding partner and sometime visiting Shark Tank Shark Emma Grede was never an AI skeptic, exactly. In 2023, she offered a cash bonus to her staff for finding creative ways to use AI in their work. But she herself was mostly just using ChatGPT as an occasional replacement for Google search.
“I’m using AI like a 42-year-old woman,” she joked in a recent Fortune interview. Then she had former Shark Mark Cuban on her podcast.
Turns out the billionaire founder and former Mavs owner has strong words — two, to be exact — for people like Grede who are dragging their feet on experimenting with AI.
Talking to Cuban was enough to convince Grede to change her approach. She started Googling classes on AI and downloading AI apps immediately. The episode “gave me a new urgency around how I use AI,” she told Fortune. “He gave me a kick.”
It might be just the kick you need too.
Not learning AI? Mark Cuban says “you’re f***ed”
On her podcast, Grede didn’t ask Cuban about AI. She asked him how to get started with a business idea. But the billionaire entrepreneur insisted that, these days, there’s no separating going from idea to execution from utilizing AI. You need the latter to do the former fast and well.
“The first thing you have to do is learn AI,” Cuban responded. “Whether it’s ChatGPT, Gemini, Perplexity, Claude, you’ve got to spend tons and tons and tons of time just learning how it works and how to ask it questions.”
Noodling around with new tools and asking various AI models questions is how Cuban is spending his time at the moment. And he has no patience for founders and others in business who aren’t doing the same.
“What do you say to someone who is like, ‘I don’t like AI. I don’t want any more technology in my life’?” Grede asked. Cuban’s answer was short, punchy, and profane: “You’re f***ed.”
Is Mark Cuban right?
Cuban went on to explain that the current moment is much like his early career at the dawn of the internet age. New, hugely disruptive technology is rolling out at an incredible rate. Those who don’t run to keep up are going to end up as roadkill.
Saying you don’t want to use AI, he says, “is like people saying back in the day, I don’t want to use the PC. I don’t want to use the internet. I don’t need a cellphone, Wi-Fi.” Those businesses died.
Is he right in making the comparison? He’s certainly correct that those around you are adopting AI at a rate equal to or greater than the rate at which the internet took off.
Harvard researchers have compared recent data on AI usage to government data on the uptake of new technology at the turn of the millennium. They found more people are using AI more quickly these days than people started adopting the internet back then.
“The usage rate [for AI] … is actually higher than both personal computers and the internet at the same stage in their product cycles,” the trio of researchers explained to The Harvard Gazette.
No one can predict the future. And the breathlessness of some discussions of AI certainly suggests that the hype will exceed the reality in plenty of areas. We may yet witness an AI “trough of disillusionment” or even a crash. But the numbers strongly suggest that Mark Cuban is on to something when he says that ignoring AI is just not a viable option.
What happened to businesses that ignored the internet?
“If you were to go back to 1984 and tell people, ‘Hey, there’s this new thing called the personal computer. I have a crystal ball. Twenty years from now, everybody’s going to have one of these and every single new technological development and every single new product is going to be using it as the base.’ Knowing that now, what would you do differently?” the Harvard researchers ask.
“You could make billions and billions of dollars,” they add.
According to their data, they say, “it sure looks like generative AI is going to be on that scale,” and “the spoils will go to people who can figure out how to harness it first and best.”
How to get started with AI
If you’re convinced, how do you start learning AI? Playing around with new tools and technologies as Cuban suggests is certainly a good first step. Elsewhere, Cuban — along with other tech icons like Tim Cook and Bill Gates — has outlined specific ways he’s using AI, which could give you additional ideas.
Other AI experts have advice as well. Nvidia CEO Jensen Huang has talked on multiple occasions about how he’s personally experimenting with AI. OpenAI president Greg Brockman has offered advice on honing your AI prompting skills.
No one knows exactly how the AI revolution will play out, or even the best way to start to prepare. But even the skeptics should probably heed Mark Cuban’s words and admit that AI is going to change the world.
If you stick your head in the sand, you’re doomed. Better start experimenting today so you can be prepared however this thing plays out.
EXPERT OPINION BY JESSICA STILLMAN @ENTRYLEVELREBEL
Monday, September 8, 2025
Is the AI Bubble Too Big to Fail?
On Wednesday, analysts bemoaned Nvidia’s lackluster Q2 earnings. The company posted a 56 percent gain in sales, its smallest in more than two years, despite the chipmaker’s positioning as one of the biggest winners of the AI boom. The company’s inability to live up to expectations has reignited fears of an AI bubble on the precipice of rupture.
Despite Silicon Valley throwing hundreds of billions of dollars into its most speculative gamble yet, the revolutionary promises, and more important, profits, of AI have yet to materialize. OpenAI is expected to lose money this year, even as its revenue exceeds a projected $20 billion. Meta’s CFO told investors, “We don’t expect that the genAI work is going to be a meaningful driver of revenue this year or next year,” despite the company dropping upwards of $70 billion on its AI investments this year. A recent MIT study found that U.S. companies have invested between $30 billion and $40 billion into generative AI tools but are seeing “zero return” from AI agents.
Some fear that all of this could presage a collapse bigger than the dot-com bust of the early 2000s. As Apollo Global Management’s chief economist warned in a recent investor note, big tech firms are driving the market with valuations more bloated than they were in the 1990s. This would be scary for big tech companies—except many of them, according to several researchers who spoke to Inc., are already too big to fail, thanks to how closely the industry has become intertwined with our economy and government.
The leading AI companies believe “the only way for this technology to exist is to be as big as possible, and the only way for it to get better is to throw more money at it,” says Catherine Bracy, CEO of the policy and research organization Tech Equity. That need for money and investment has spurred an industry lobbying blitz, pushing everyone from OpenAI CEO Sam Altman to VCs like Andreessen Horowitz into the halls of Congress over the past couple of years. Just earlier this week, The Wall Street Journal reported that Andreessen Horowitz and OpenAI are behind a nascent lobbying campaign through a super PAC network that’s already amassed $100 million to elect AI-friendly candidates.
Those beltway relationships appear to be paying off. Currently, more than 30 states offer tax incentives for data center construction. But the booming growth of the industry has been enormously costly, largely owing to the vast amounts of energy needed to run large language models.
The Trump administration’s AI Action Plan frames the industry’s growth as essential to “human flourishing” in the U.S. and the country’s continued geopolitical dominance.
“We’re now locked into a particular version of the market and the future where all roads lead to big tech,” says Amba Kak, co-executive director of the AI Now Institute, which studies AI development and policy. Indeed, the success of major stock indexes—and perhaps your 401(k)—is resting on the continued growth of AI: Meta, Amazon, and the chipmakers Nvidia and Broadcom have accounted for 60 percent of the S&P 500’s returns this year.
But ultimately, in the event of a market reckoning, it’s likely that the biggest companies would remain relatively unscathed. “AI is too big to fail in the United States, both because of how intertwined it has become with the government, and also because of how much AI investment is propping up the stock market and the entire economy,” says Daron Acemoglu, an economist at MIT. When the bubble pops, it’s likely going to be the smallest AI businesses, those riding the AI hype train with products based on existing LLMs, that’ll get wiped out in an eventual rupture. “Those little companies are not going to get bailed out,” he argues.
Hardware companies like Nvidia or big tech firms, with diverse revenue streams, are likely to be better insulated from the potential fallout of the bubble popping. As Timnit Gebru, a former Google AI researcher and founder of the Distributed AI Research Institute, puts it, a chipmaker like Nvidia is essentially just selling shovels during a gold rush. “Shovels are still useful with or without the gold rush,” she says.
BY SAM BLUM @SAMMBLUM
Friday, September 5, 2025
Why Google’s New AI Image Generator Could Give OpenAI a Run for Its Money
Google just dropped a major update for its AI image generation tech, enabling anyone to generate and edit images with more accurate results.
In a blog post, Google revealed Gemini 2.5 Flash Image (also called nano-banana), its latest and greatest AI model for generating and editing images. Google says the new model gives users the ability to blend multiple images into a single image, maintain character consistency across multiple generations, and make more granular tweaks to specific parts of an image.
One of the model’s new features is the ability to maintain character consistency, meaning that if you create a specific look for an AI-generated character, the character will maintain that look each time you generate a new image featuring them. “You can now place the same character into different environments,” Google wrote, “showcase a single product from multiple angles in new settings, or generate consistent brand assets, all while preserving the subject.”
Gemini 2.5 Flash Image can also make more granular edits to images, like blurring a background or changing the color of an item of clothing.
Another major feature is the ability to fuse multiple images into a single image. Google says this could let people place an object into a room or restyle an environment with a new color scheme or texture. To demonstrate, Google built a demo in which users can upload a picture of a room, upload images of products that they’d like to see in the room, and then drag the product image to the specific place where they want it to appear in the room. It’s not difficult to imagine people using this feature to see how a new appliance or piece of furniture will look in their home before committing to a purchase.
Google also says that Gemini 2.5 Flash Image is particularly adept at sticking to visual templates, such as real estate listing cards, uniform employee badges, and trading cards. This kind of feature could also be used to create thumbnails for YouTube videos.
Gemini 2.5 Flash Image actually debuted on the website LMArena last week under the codename nano-banana. LMArena is a platform for evaluating an AI model’s performance against other models, and big artificial intelligence companies often submit their new models to the site before publicly revealing them.
Also of note is Gemini 2.5 Flash Image’s API price. According to Google, the model is priced at $30 per one million output tokens. In comparison, OpenAI’s image-generation API costs $40 per one million output tokens, making Google’s offering significantly cheaper.
The new model can be used in the Gemini app and in Google AI Studio.
BY BEN SHERRY @BENLUCASSHERRY
Wednesday, September 3, 2025
Mark Cuban Says Young People Should Learn This Crucial AI Skill
Legendary investor Mark Cuban has some advice for college students looking to break into the red-hot AI industry: become an AI integrator.
During a livestreamed interview on TBPN (the Technology Business Programming Network), Cuban told hosts John Coogan and Jordi Hays that young people in college should learn everything they can about how to integrate AI within corporations, particularly within small to medium-size businesses.
Cuban claimed that “every single company” needs professionals with AI implementation skills because there currently aren’t any intuitive ways for corporations to integrate AI into their work. “There are 33 million companies in this country,” Cuban said, and only a select few have dedicated AI budgets or keep AI experts on payroll. But these companies will still need to adapt for the AI era.
Cuban likened this issue to how he started his career as an entrepreneur. “When I was 24,” Cuban said, “I was walking into companies who had never seen a PC before in their lives and explaining to them the value.” Cuban said he would meet with the owners of these companies and present them with customized plans that used computers to fulfill their specific business needs.
“This is where kids coming out of college are really gonna have a unique opportunity,” said Cuban. Students spending their senior years “learning the difference between Sora and Veo [two popular AI video-generation tools],” or learning how to customize an AI model, will be able to walk into any business and identify clear areas where AI implementation would meaningfully impact their operations.
TBPN co-host Coogan agreed with Cuban’s take, and added that he and Hays hired two interns this summer “because they just built products. Instead of saying, ‘Here’s what I can do,’ they just showed us. They took a day and just built something.”
Meanwhile, trying to work at one of the big tech companies with a computer science degree is “probably not the right way to go,” says Cuban. Instead, he says, “go into any other company that has no idea about AI but needs it to compete. There’ll be more jobs than people for a long, long time.”
BY BEN SHERRY @BENLUCASSHERRY
Monday, September 1, 2025
Why Companies Are Offering Young Workers With AI Skills 6-Figure Salaries
While the entry-level job market on the whole is still hurting, recent graduates who possess AI skills are finding sizable demand for their services. And starting salaries can reach hundreds of thousands of dollars per year.
A new report by hiring firm Burtch Works finds that the starting salary of AI-skilled workers with zero to three years of work experience now averages $131,139—a 12 percent jump from the year prior. Data scientists with the same level of limited experience are averaging $109,545 a year.
Compensation levels vary slightly by industry, the report found, but the mean salary for all covered industries with zero to three years’ experience was in the six-figure range. Health care/pharma is currently paying the most to AI-fluent workers, with a mean salary of $123,804. Consulting and tech are in a virtual tie at the bottom of the list, at roughly $104,500.
“AI professionals still command a 9 to 13 percent cash premium over data scientists. The gap is widest where scarce [generative AI] expertise adds the most value,” Burtch Works wrote in its report. “If you’re seeking a job in AI and data science, quantify your genAI successes to demonstrate your skills in action [and] reference market data during salary negotiations.”
The current demand for AI knowledge is unprecedented. Job search site Indeed earlier this year said the number of postings for generative AI-related jobs had tripled between January 2024 and January 2025. That followed a 75X increase from April 2022 to April 2024.
New college graduates are not just digital natives, they’re often AI natives, having grown up with early versions of the technology and learning as it has evolved. That can make them a more natural fit for AI-themed jobs than more experienced workers, who may be more resistant to adopting the technology, in part because of fears it will make their jobs irrelevant.
That has led to a bidding war for AI-savvy graduates. OpenAI is reportedly offering a base salary of $167,000, with more than $80,000 in stock options, to entry-level workers, bringing its average compensation to $248,000, according to Levels.fyi, a compensation-data provider. Scale AI reportedly has a total starting compensation package average of $185,000, and Databricks is offering $235,000. Within a couple of years, those numbers nearly double, per the Levels.fyi data.
Several dozen users of Levels.fyi have claimed to have received offers of over $1 million from AI companies, with some of them having less than a decade of experience.
At the same time, the number of AI job openings has soared. A study released in January by job tracking firm LinkUp and the University of Maryland found that from the beginning of 2018 to the end of 2024, the number of overall job openings was down 17 percent and total IT job openings fell by 27 percent. AI job openings, however, saw a 68 percent increase.
Demand for AI skills has become so intense that many hiring managers say they would consider bringing aboard an inexperienced worker with AI expertise over a more experienced employee without it. And 66 percent of those managers said they wouldn’t hire someone who lacked AI skills, according to the 2024 Annual Work Trend Index by Microsoft and LinkedIn.
BY CHRIS MORRIS @MORRISATLARGE
Friday, August 29, 2025
How to Get Your Money’s Worth on Workplace AI Tools
Critics and skeptics of artificial intelligence technologies have repeatedly denounced the rising buzz the platforms have generated over the past few years, often deriding it as unfounded hype that ignores apps’ current productivity limitations. Now, a new study from MIT largely supports those doubters, finding that a whopping 95 percent of businesses that have adopted AI have thus far gotten zero return on their investment.
That was the headline takeaway from a report by MIT Media Lab’s Project NANDA, which was based on survey results and face-to-face interviews with hundreds of senior U.S. business leaders and employees. Despite the study’s estimate that companies have spent $30 billion to $40 billion developing or purchasing AI platforms in the past two years alone, it said only 5 percent of those firms have reported any return on that investment. “The vast majority remain stuck with no measurable (profit or loss) impact,” it said.
Similarly, only two of the eight sectors examined — technology, and media and telecom — showed any significant change attributable to the use of AI.
“The outcomes are so starkly divided across both buyers (enterprises, mid-market, SMBs) and builders (startups, vendors, consultancies) that we call it the GenAI Divide,” the report’s authors wrote. “The core barrier to scaling is not infrastructure, regulation, or talent. It is learning. Most GenAI systems do not retain feedback, adapt to context, or improve over time.”
Why AI is falling short
Participating business executives in NANDA’s The GenAI Divide: State of AI in Business 2025 report offered two main reasons for the tech falling far short of expectations so far. On the development side, it said only 5 percent of tools designed to fulfill specific company needs or business functions ever reach production. The rest remain stuck at the pilot stage, ambitious ideas that never make it past the drawing board, despite developer promises that they’re speeding toward completion.
“We’ve seen dozens of demos this year,” said one unidentified chief information officer during a NANDA interview. “Maybe one or two are genuinely useful. The rest are wrappers or science projects.”
That, in turn, means many companies are instead using more generalist AI tools like ChatGPT or Copilot. While those tend to be effective at automating repetitive workplace grunt chores like research, text composition, or marketing work, they fail to generate significant increases in key metrics like productivity, customer acquisition, or profits.
As a result, study respondents said most of the previous and current excitement over AI has not been matched by the revolutionary results that its boosters say it will deliver.
“The hype on LinkedIn says everything has changed, but in our operations, nothing fundamental has shifted,” said one midmarket chief operating officer quoted in the study. “We’re processing some contracts faster, but that’s all that has changed.”
The study identified two additional divides in AI use by businesses.
The first was that more than 80 percent of organizations have tested or piloted apps, with about half of those organizations saying the apps are now in regular workplace use. Startups, small companies, and midmarket businesses were found to be the fastest in that transition.
But the vast majority of that experimentation and integration involved platforms like ChatGPT or other general-purpose AI bots. While those do often help increase individual employee productivity on certain tasks, study participants said, those gains tend to plateau fairly fast because the apps can’t scale them any further.
“ChatGPT’s very limitations reveal the core issue behind the GenAI Divide: it forgets context, doesn’t learn, and can’t evolve,” the study said. As a result, human employees still need to oversee the tech’s results and pursue myriad business objectives that apps can’t.
But survey participants who faulted the limitations of general apps were even harsher with AI created for and tailored to their companies or specific business applications.
“The same users were overwhelmingly skeptical of custom or vendor-pitched AI tools, describing them as brittle, overengineered, or misaligned with actual workflows,” the report said. “They expect systems that integrate with existing processes and improve over time. Vendors meeting these expectations are securing multi-million-dollar deployments within months.”
How to make AI fit your business
So how can employers adopt AI into their operations without winding up on the wrong side of the divide?
For starters, the NANDA study urges companies to build their own AI platforms whenever possible. Those apps should be tailored to their particular business needs, which should enable them to deliver better outcomes than generalist tools. When necessary, employers can turn to outside providers to design solutions for their specific uses.
The report’s authors also advise businesses to allow managers, and even team leaders, to decide the best ways of deploying apps to get the desired results, rather than having the tech department designate a one-size-fits-all use. Over time, executives should also base evolving AI deployment on where it is creating the most profitable gains.
“The highest-performing organizations report measurable savings from reduced (business process outsourcing) spending and external agency use, particularly in back-office operations,” the authors wrote. “Others cite improved customer retention and sales conversion through automated outreach and intelligent follow-up systems.”
And finally, employers should base whatever AI platforms they ultimately assemble on tech capable of fully integrating the information it acquires during use and of continually evolving and improving with that experience.
“Stop investing in static tools that require constant prompting, [and] start partnering with vendors who offer custom systems, and focus on workflow integration over flashy demos,” the report concludes. “The GenAI Divide is not permanent, but crossing it requires fundamentally different choices about technology, partnerships, and organizational design.”
On the bright side
Did researchers find any positive aspects to AI’s mega-hype and mini-results so far? Perhaps — at least for employees worried about the tech taking over their jobs.
The study determined that layoffs linked to AI deployment have been minimal so far, and usually concentrated in the companies deploying the tech most aggressively. Perhaps unsurprisingly, those firms were often subcontractors handling marketing, communications, and customer service support for other businesses, outsourced work that employers may in the future decide to handle in-house using their own apps.
BY BRUCE CRUMLEY @BRUCEC_INC
Wednesday, August 27, 2025
Sam Altman Admits the AI Bubble Is Here
In an interview with reporters from multiple publications on Thursday night, OpenAI CEO Sam Altman said he believes the AI sector has entered the territory of a financial bubble.
The AI sector has exploded since 2022, largely based on the growth of Altman’s company and its flagship product, ChatGPT. Economists and tech critics have argued recently that the billions of dollars in venture investment in AI companies, and the crush of startups jumping on the AI bandwagon, have been reminiscent of the dot-com bubble and crash of the late 1990s.
Altman made the same analogy in his interview with reporters. “When bubbles happen, smart people get overexcited about a kernel of truth,” Altman said, according to The Verge. “If you look at most of the bubbles in history, like the tech bubble, there was a real thing. Tech was really important. The internet was a really big deal. People got overexcited.”
He added that “someone is going to lose a phenomenal amount of money. We don’t know who, and a lot of people are going to make a phenomenal amount of money.”
Earlier this week, OpenAI released its newest model, GPT-5, to some negative reviews. The CEO had initially promised the model would offer “PhD-level intelligence” in most tasks. But for many people, the issue came down to tone: Users claimed GPT-5 had a terser and colder temperament than its predecessor, GPT-4o.
Altman’s admission that AI is overvalued and in a bubble is significant. The CEO has served as one of the industry’s biggest boosters since the launch of ChatGPT in 2022.
But Altman had hinted that the writing was on the wall last week, when he told CNBC that the term Artificial General Intelligence (AGI)—a milestone for researchers that involves AI that’s equal to or better than humans at most tasks—isn’t a “super useful term.”
AGI, he said, is an over-used term that has lost its meaning. “I think the point of all of this is it doesn’t really matter and it’s just this continuing exponential of model capability that we’ll rely on for more and more things,” he told CNBC.
Recent reports indicate OpenAI is valued at $300 billion and approaching $20 billion in annual recurring revenue this year. Despite those impressive numbers, the company has yet to turn a profit, the CEO recently confirmed to CNBC. One factor is that the computational power required to run large language models built by OpenAI and its competitors is notoriously expensive.
Warnings of an AI bubble bursting are not new. In 2023, the venture capitalist Jason Corsello, CEO and general partner of Acadian Ventures, told Inc: “This area of A.I. is somewhat overhyped. It’s over-invested, it’s overvalued. When you’re seeing seed-stage companies raise between $100 million and $150 million with nothing more than a pitch deck, that’s a bit concerning.”
OpenAI, for its part, is currently seeking a $500 billion valuation through a tender offer for current and former employees.
BY SAM BLUM @SAMMBLUM
Monday, August 25, 2025
Anthropic Is Making It Easier to Learn How to Code
Anthropic’s Claude is getting a side gig as a tutor. The company has launched new modes for its two consumer-facing platforms, Claude.ai and Claude Code. The modes will enable Claude to not just answer questions and write code, but also structure its outputs to teach users through a process Anthropic refers to as guided discovery.
The company originally released a learning mode in April, but it was available only to university students and faculty with Claude for Education memberships. In late July, OpenAI released a similar feature for ChatGPT called study mode.
Using Claude.ai, Anthropic’s ChatGPT-like website and mobile app for casually interacting with its AI models, users will be able to enable the learning mode by selecting it from a dropdown menu of various styles. According to Anthropic, Claude will use a “Socratic approach” to guide users through challenging concepts instead of immediately giving answers. If you’re a student using Claude to help with your homework or studying, this could be a useful feature.
Beyond that option, Claude Code, Anthropic’s tool for software development with AI, will feature two learning modes. Anthropic’s models are famed for their coding ability, and have given rise to a generation of startups pioneering a new method of software engineering called vibe coding. These new learning modes in Claude Code are designed to help developers learn more about the fundamentals of software engineering while building applications with Claude.
The first new mode in Claude Code is called Explanatory. When people use it, Claude will explain why it made certain decisions while coding, recreating the dynamic of a senior developer narrating their thought process to a junior developer.
When in the second mode, referred to as Learning, Claude will actually leave key sections of the code unfinished and direct human developers to fill in those sections themselves. Once a user fills them in, Claude will review the code and give feedback.
People can put both Claude Code learning modes to work by updating Claude Code, running /output-styles in the terminal, and selecting between the Default, Explanatory, and Learning styles.
BY BEN SHERRY @BENLUCASSHERRY
Friday, August 22, 2025
As AI Agents Fill the Workplace, Their Human Colleagues Stay Wary
As we wait for AI’s promised evolution into artificial general intelligence (AGI), with capabilities on par with human workers, the most sophisticated AI tools on the market are AI agents. These semi-autonomous systems can make certain decisions on their own and even carry out actions in a digital environment that would usually be done by people. In January, OpenAI’s Sam Altman said AI agents could transform the workplace in 2025. With the year more than half over, is he right? New data from business leaders says maybe yes. But workers? They don’t trust ‘em.
A survey of employees from around the world by California-based HR software firm Workday found upbeat results when it came to how workers feel about using buzzy AI agent tech. Amazingly, three-quarters of the survey respondents said they felt comfortable interacting with AI agents at work, news site ZDNet reported. That’s a really high comfort level with what is very much a breakthrough innovation.
The numbers tell a very different story when it comes to taking orders from an AI agent, however. Only 30 percent of respondents said they’d be comfortable being bossed by a digital “colleague.” Just 24 percent of people felt okay with the idea of running agents inside a company without a human monitoring the situation. ZDNet noted parallels between this outcome and recent research from Stanford University which found a certain level of trust of AI agents, but only for very basic tasks.
That wariness may be healthy. In June, two researchers from leading AI firm Anthropic warned that they could foresee a future in which AIs make decisions and employees have to blindly follow them as a kind of “meat robot.” The fact that so few people would follow an agent’s instructions without at least applying a smidge of critical thinking should be reassuring.
Trusting the tools of the workplace is critical — we’ve all used the “good” printer in the office when we needed an urgent copy of an important report, rather than relying on the nearby one. The same seems true of AI agents: People are happy to embrace them for simple tasks, but are much warier about following critical decisions made by an AI tool.
But trust builds over time, and Workday’s data found evidence of this in attitudes toward AI: The more employees work with agents, the more they trust the systems’ outputs. Part of this trust may come from the fact that 90 percent of the survey respondents said they felt AI agents would boost productivity — any tool that useful can’t be bad, can it?
But even here, workers seem quite savvy about the risks of AI systems: Many survey respondents worried that overreliance on AI tools could dull their own critical thinking, build a workplace around fewer human interactions, and tempt managers to raise their demands on the strength of AI-driven productivity gains.
Another worry workers have about AI agents, one that may feed into questions of employee trust, is that the new technology could take their jobs. That concern may be borne out: A recent report on the advertising industry shows it is shedding entry-level workers. People aged 20 to 24 now hold 6.5 percent of all jobs in the industry, down from 10.5 percent in 2019, and AI’s role in this decline can’t be ignored, industry news site AdWeek contends.
Why should you care about this?
Because you may have rolled out agent-based AI tools to your workforce, and then sat back — confident in your employees’ ability to make the most of this smart tech, and reap the benefits of all that extra productivity. The reality may be slightly different. It may be worth running an audit of how comfortable your workers are with this tech, and also educating them about how you would actually like them to use these AI agents. Reassuring them that you won’t replace them with a pile of silicon chips may also be a good idea.
BY KIT EATON @KITEATON
Wednesday, August 20, 2025
BUILDING AI FACTORIES
Imagine a place where innovation meets industrialization, where AI is not just a concept but a reality, where raw data is transformed into actionable intelligence at lightning speed.

That’s an AI factory, an environment designed to manage the entire AI lifecycle — from data pipelines and model training to inference and real-time insights. With purpose-built infrastructure, integrated tools, scalable operations, and unparalleled AI expertise, the AI factory can revolutionize the way you harness the power of artificial intelligence.

Think of it like a traditional factory, but instead of producing physical goods, it creates value and intelligence from data. AI factories take in raw data, process it through AI models, and output actionable intelligence, predictions, or new AI solutions. The journey from data to intelligence is streamlined, efficient, and groundbreaking.

The result? Faster innovation, operational efficiency, scalability, and greater control over data and business outcomes.

Why do you need an AI factory? Because operationalizing AI can be challenging.

As organizations embrace AI’s transformative potential, they face a range of complexities inherent in fully operationalizing AI. These challenges include:

— Complex AI workloads: Managing diverse and resource-intensive AI workloads can overwhelm existing infrastructure, leading to inefficiencies and delays.

— Need for multitenancy: Efficiently managing multiple tenants and their resources is complex and resource-intensive, and can create conflicts.

— High costs of cloud AI: The expenses associated with deploying AI solutions in the cloud can be prohibitive, impacting budget and ROI.

AI is iterative, and models can degrade over time due to data drift, changing customer behavior, and environmental shifts. To maintain relevance and performance, a high-performing AI factory infrastructure is essential for retraining models, conducting simulations, monitoring inference quality, and managing deployment pipelines for continuous improvements.
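The drift-monitoring and retraining loop described above can be sketched in a few lines. This is an illustrative toy, not any vendor’s product: the mean-shift statistic and the two-sigma threshold are assumptions chosen for clarity.

```python
# Compare a live feature distribution against the training baseline and
# flag the model for retraining when the gap grows too large.
from statistics import mean, stdev

def drift_score(baseline, live):
    """Shift of the live mean, measured in baseline standard deviations."""
    return abs(mean(live) - mean(baseline)) / (stdev(baseline) or 1.0)

def needs_retraining(baseline, live, threshold=2.0):
    """True when the live data has drifted past the (assumed) threshold."""
    return drift_score(baseline, live) > threshold
```

In a real pipeline, a check like this would run on a schedule and kick off the retraining, simulation, and redeployment steps the passage lists.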
Monday, August 18, 2025
The Vibe-Coding Companies and Founders to Watch in 2025
In a blog post published in early January, OpenAI CEO Sam Altman opined that in 2025, the first AI agents would enter the workforce and materially change the output of companies. Eight months into the year, it’s arguable that he’s been proven correct.
That’s because AI agents are the key element behind the explosive rise of vibe coding, a term coined in February 2025 by famed AI researcher and OpenAI cofounder Andrej Karpathy to refer to the act of writing and editing code with assistance from an AI system. Karpathy posted on X that “there’s a new kind of coding I call ‘vibe coding’, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.”
Karpathy’s post ushered in a new chapter in the AI era, with experienced developers adopting AI coding tools en masse (a recent survey from Stack Overflow found that 80 percent of developers now use AI tools in their workflows) and programming neophytes creating their first apps. This interest has sent several startups’ revenues into the stratosphere, in some cases 10xing revenue in a matter of months.
And that attention isn’t just coming from customers, but investors too. This year, companies in the vibe coding space have raised billions from venture capital firms. Unlike many of the anticipated use cases for AI, vibe coding is already fundamentally changing how people interact with computers, and as such has become a focal point for the AI revolution.
These are the companies and people shaping the world of vibe coding:
Anthropic
Anthropic has played a crucial role in the success of the vibe coding industry. The Dario Amodei-led company’s Claude AI models have proven supremely adept at handling programming tasks, and are the preferred models for many of the major players in the vibe coding space.
In addition to powering other companies’ platforms, Anthropic also produces multiple vibe coding applications of its own. The first is Artifacts: interactive applications that can be created in Claude.ai, Anthropic’s consumer-facing platform for using its AI models. Artifacts are meant to serve only as prototypes, and Anthropic warns that software created with Artifacts is not production-ready.
Anthropic’s other vibe-coding product, Claude Code, is a program that connects directly to a user’s computer terminal, and is meant for developing full-fledged pieces of software that can be deployed and used by many people.
Anysphere
Anysphere, the organization behind the wildly popular AI-powered code editor Cursor, refers to itself as the fastest-growing startup in history. In early June 2025, Anysphere announced that it had raised $900 million at a $9.9 billion valuation. That’s a big jump from just six months earlier in January, when the startup raised $105 million at a $2.5 billion valuation.
Anysphere was founded in 2022 and released Cursor in late 2023. Within 12 months of its launch, according to Bloomberg, Cursor was bringing in over $100 million in annual recurring revenue. Cursor uses AI models from other companies to power its code editor, and is reportedly one of Anthropic’s top customers. Like Cognition’s Devin, Cursor is known for helping software developers achieve a coding flow state.
Cognition and Windsurf
Cognition was one of the first companies to get into the world of AI-powered coding. The company was founded by a group of young, award-winning coders who developed a powerful software development assistant called Devin in late 2023. Devin was among the first AI-powered applications to be capable of developing an entire piece of software with nothing but a prompt to get it started.
Since Devin was revealed in early 2024, Cognition has raised hundreds of millions of dollars, and is now reportedly in talks to raise over $300 million from investors at a $10 billion valuation.
And now, Cognition is the owner of a former rival: Windsurf.
Windsurf was originally founded as Codeium in 2021 by a pair of MIT graduates who wanted to make GPU workloads more efficient to process, but the company pivoted in 2022 after witnessing the rise of generative AI tools like OpenAI’s Dall-E. The company found success in shipping AI-powered coding extensions, and then in November 2024 released the Windsurf Editor, a virtual development environment with agentic AI built in. Windsurf immediately took off with experienced coders, who enjoyed the editor’s ability to put engineers in a kind of flow state, in which they can seamlessly work in tandem to quickly create new software.
In May 2025, OpenAI reportedly reached a deal to acquire Windsurf for $3 billion, but the deal fell apart due to stipulations in OpenAI’s agreement with Microsoft. Once the exclusive negotiating period ended, Google quickly hired away Windsurf’s CEO and dozens of top employees. After a frantic weekend of dealmaking, Cognition announced that it would buy Windsurf but keep it as a separate entity, with all remaining employees sticking around for the transition.
Jack Dorsey
The Twitter founder and Block CEO has also been getting in on the vibe coding fun. Last month, Dorsey announced that he had used a Block-developed AI agent to create a new app, called Bitchat, that puts a twist on traditional social media.
Bitchat is a peer-to-peer messaging app that uses Bluetooth to enable wireless messaging without needing an internet connection. The app essentially uses the web of connections made by Bluetooth-capable devices to create a working network. However, people quickly started pointing out potential flaws in Bitchat’s design, potentially due to its vibe-coded nature.
Lovable
Hailing from Sweden, Lovable is one of the few European stars of the AI revolution. Founded in 2023, Lovable is a platform that enables people of any skill level to create fully functioning websites from natural-language prompts. According to Lovable CEO and co-founder Anton Osika, Lovable is meant to be “the last piece of software that anyone has to write.”
In late July, Lovable announced that it had passed $100 million in annual recurring revenue, only eight months after making its first $1 million. This, according to the company, makes Lovable the actual fastest-growing startup in the world, outpacing Cursor. The company also announced that over 100,000 projects are now being built on Lovable each day.
Microsoft
Microsoft’s GitHub Copilot is an AI-powered coding assistant built in tandem with ChatGPT creator OpenAI. Like Cursor, GitHub Copilot allows users to choose between various models to handle specific coding challenges and is designed for professional developers rather than novices. Still, Microsoft has been adding more agentic capabilities recently, enabling Copilot to handle more coding tasks by itself, rather than just editing small snippets at a time.
On Microsoft’s most recent quarterly earnings call, CEO Satya Nadella shared that GitHub Copilot had hit over 20 million lifetime users, up from 15 million in April, and is now being used by 90 percent of the Fortune 100.
Replit
Replit is one of the older startups on this list—it was founded way back in 2016. Originally, Replit was Repl.it, a cloud-based coding environment that could be accessed from anywhere. In essence, Repl.it was like Google Docs for coding, a web-based app that enabled multiple people to collaborate on a coding project at once.
In September 2024, Replit released Replit Agent, a new feature that enabled users to describe an application or piece of software in natural language, and then send an AI agent off to plan, code, and deploy the app. Replit Agent was an instant hit, and was so successful that it fundamentally changed the trajectory of the company. Once focused on catering to professional and skilled coders, Replit is now fully embracing the casual audience. In a January interview with Semafor, Replit CEO Amjad Masad even said that “we don’t care about professional coders anymore.”
In the roughly 9 months since Replit Agent was launched, Masad says Replit’s annual recurring revenue has exploded from $10 million to $100 million.
Theo Browne
A former software development engineer for Twitch, Theo Browne has emerged as one of the most notable influencers in the fast-moving world of vibe coding. Browne releases multiple YouTube videos per week in which he gives his perspective on the latest AI headlines and experiments with new vibe coding platforms. Browne’s most popular videos include tutorials, tier lists, and comparisons between popular tools. Browne is also the founder of Ping Labs, a startup developing AI-powered tools.
BY BEN SHERRY @BENLUCASSHERRY
Friday, August 15, 2025
North Korean Hackers Are Using AI to Get Jobs at U.S. Companies and Steal Data
Cyberattacks are getting faster, stealthier, and more sophisticated—in part because cybercriminals are using generative AI.
“We see more threat actors using generative AI as part of their tool chest, and some of those threat actors are using it more effectively than others,” says Adam Meyers, head of counter adversary operations at CrowdStrike.
The cybersecurity tech company released its 2025 Threat Hunting report on Monday. It detailed, among other findings, that adversaries are weaponizing genAI to accelerate and scale attacks—and North Korea has emerged as “the most GenAI-proficient adversary.”
Within the past 12 months alone, CrowdStrike investigated more than 320 incidents in which operators associated with North Korea fraudulently obtained remote jobs at various companies. That represents a jump of about 220 percent year-over-year. The report suggests operatives used genAI tools “at every stage of the hiring and employment process” to automate their actions in the job search through the interview process, and eventually to maintain employment.
“They use it to create resumes and to create LinkedIn personas that look like attractive candidates you would want to hire. They use generative AI to answer questions during interviews, and they use deep fake technology as well during those interviews to hide who they are,” Meyers says. “Once they get hired, they use that to write code to allow them to hold 10, 15, 20, or more jobs at a time.”
In late July, an Arizona woman, Christina Chapman, was sentenced to eight years in prison for her role in assisting North Korean workers in securing jobs at more than 300 U.S. companies; that generated an estimated $17 million in “illicit revenue,” according to the Department of Justice. In late 2023, some 90 laptops were seized from her home.
North Korean fraudsters, however, aren’t the only threat facing businesses, academic institutions, and government agencies.
“We’re seeing more adversary activity every single day,” Meyers says. “There are more and more threat actors engaging in this, and it’s not just criminals or hacktivists. We’re also seeing more nation states.”
Although North Korea’s attacks may be among the most attention-grabbing, Meyers says “China is probably the number-one threat out there for any Western organization.” In the past year, CrowdStrike noted a 40 percent jump in cloud intrusions that it attributed to China-related adversaries. Cloud intrusions overall jumped about 136 percent in the first half of 2025, versus all of the previous year, according to the report.
Although the tech industry is the most targeted industry overall, Chinese adversaries substantially ramped up attacks on the telecom sector within the past year, according to the report.
“The telecommunications sector is a high-value target for nation-state adversaries, providing access to subscriber and organizational data that supports their intelligence collection and counterintelligence efforts,” the report states.
As technology becomes more sophisticated, it may seem overwhelming for organizations trying to keep attackers at bay. Meyers counseled individuals on security teams to make use of those very same tools that bad actors are using to fight back.
“Generative AI was being used by these threat actors, but it could also be used by the good guys to have more effective defenses,” he says. “We have that capability in some of [CrowdStrike’s] products, but you can use generative AI to kind of scale up those capabilities within the security team.”
He also recommended organizations be proactive, rather than reactive to threats.
“If you wait for bad stuff to show itself, it’s going to be too late,” he says. “Probably one of the biggest takeaways is that you need to have threat hunting.”
Just over a year ago, a CrowdStrike update precipitated what has since been called one of history’s biggest IT failures. A buggy security update caused Windows devices to crash, affecting a broad swathe of companies in banking, health care, and aviation, among others. Delta Air Lines was notably affected and is suing CrowdStrike, alleging the outage caused as many as 7,000 flight cancellations and as much as $550 million in lost revenue and other expenses, Reuters reported.
BY CHLOE AIELLO @CHLOBO_ILO
Wednesday, August 13, 2025
This Female-Led AI Company Helps Fix Manufacturing Problems in Real Time—or Before They Happen
SixSense is using AI to shore up semiconductor production—and the female-founded startup just raised $8.5 million to do it.
SixSense is developing “factories that think” to bring what it calls “intelligent automation” to the incredibly complex and important semiconductor industry, according to its website. What this means in practice is that the company’s AI platform leverages data to catch issues early, improve output, and increase control over production.
The Singapore-based SixSense was co-founded in 2018 by CEO Akanksha Jagwani and CTO Avni Agarwal. With a background in mechanical engineering, Jagwani leads business development and efforts to partner with semiconductor fabrication plants to deploy SixSense’s AI. Major semiconductor makers including GlobalFoundries and JCET already use SixSense’s technology, according to TechCrunch. Agarwal leverages her background in computer engineering to lead the company’s tech and product vision.
“We’re already working with fabs in Singapore, Malaysia, Taiwan, and Israel, and are now expanding into the U.S.,” Agarwal told TechCrunch.
SixSense is based in Singapore, but in the U.S., at least, there is still a significant disparity in VC funding for women-led companies. According to data from PitchBook, women-only founding teams secured roughly 2 percent of VC deal value in 2024, whereas companies with both a female and a male co-founder secured about 22 percent that year.
There are also signs that women are advancing within VC firms themselves. Women now occupy close to 19 percent of leading investor roles at firms across the U.S., The Wall Street Journal reported. At “mega venture firms,” which manage $3 billion or more, however, only about a dozen managing partners are women.
SixSense’s latest round of funding brings its total to about $12 million, TechCrunch reported. Peak XV’s Surge seed platform led the round with participation from Alpha Intelligence Capital, FEBE, and more, according to TechCrunch.
BY CHLOE AIELLO @CHLOBO_ILO
Saturday, August 9, 2025
OpenAI launches GPT-5 as AI race accelerates
OpenAI has launched its GPT-5 artificial intelligence model, the highly anticipated latest installment of a technology that has helped transform global business and culture.
OpenAI's GPT models are the AI technology that powers the popular ChatGPT chatbot, and GPT-5 will be available to all 700 million ChatGPT users, OpenAI said.
The big question is whether the company that kicked off the generative AI frenzy will be capable of continuing to drive significant technological advancements that attract enterprise-level users to justify the enormous sums of money it is investing to fuel these developments.
The release comes at a critical time for the AI industry. The world's biggest AI developers - Alphabet, Meta, Amazon and Microsoft, which backs OpenAI - have dramatically increased capital expenditures to pay for AI data centers, nourishing investor hopes for great returns. These four companies expect to spend nearly $400bn (€342bn) this fiscal year in total.
OpenAI is now in early discussions to allow employees to cash out at a $500bn (€428bn) valuation, a huge step-up from its current $300bn (€257bn) valuation. Top AI researchers now command $100m (€85m) signing bonuses.
"So far, business spending on AI has been pretty weak, while consumer spending on AI has been fairly robust because people love to chat with ChatGPT," said economics writer Noah Smith.
"But the consumer spending on AI just isn't going to be nearly enough to justify all the money that is being spent on AI data centres," he added.
OpenAI is emphasizing GPT-5's enterprise prowess. In addition to software development, the company said GPT-5 excels in writing, health-related queries, and finance.
"GPT-5 is really the first time that I think one of our mainline models has felt like you can ask a legitimate expert, a PhD-level expert, anything," OpenAI CEO Sam Altman said at a press briefing.
"One of the coolest things it can do is write you good instantaneous software. This idea of software on demand is going to be one of the defining features of the GPT-5 era," he added.
In demos yesterday, OpenAI showed how GPT-5 could be used to create entire working pieces of software based on written text prompts, commonly known as "vibe coding".
One key measure of success is whether the step up from GPT-4 to GPT-5 is on par with the research lab's previous improvements.
Two early reviewers said that while the new model impressed them with its ability to code and solve science and math problems, they believe the leap from the GPT-4 to GPT-5 was not as large as OpenAI's prior improvements.
Even if the improvements are large, GPT-5 is not advanced enough to wholesale replace humans. Mr Altman said that GPT-5 still lacks the ability to learn on its own, a key component to enabling AI to match human abilities.
On his popular AI podcast, Dwarkesh Patel compared current AI to teaching a child to play a saxophone by reading notes from the last student.
"A student takes one attempt," he said. "The moment they make a mistake, you send them away and write detailed instructions about what went wrong. The next student reads your notes and tries to play Charlie Parker cold. When they fail, you refine the instructions for the next student. This just wouldn't work," he said.
More thinking
Nearly three years ago, ChatGPT introduced the world to generative AI, dazzling users with its ability to write humanlike prose and poetry, quickly becoming one of the fastest growing apps ever.
In March 2023, OpenAI followed up ChatGPT with the release of GPT-4, a large language model that made huge leaps forward in intelligence.
While GPT-3.5, an earlier version, received a bar exam score in the bottom 10%, GPT-4 passed the simulated bar exam in the top 10%.
GPT-4's leap was based on more compute power and data, and the company was hoping that "scaling up" in a similar way would consistently lead to improved AI models.
But OpenAI ran into issues scaling up. One problem was the data wall the company ran into, and OpenAI's former chief scientist Ilya Sutskever said last year that while processing power was growing, the amount of data was not.
He was referring to the fact that large language models are trained on massive datasets scraped from across the internet, and AI labs have few other sources of comparably large troves of human-generated text.
Apart from the lack of data, another problem was that "training runs" for large models are more likely to have hardware-induced failures given how complicated the system is, and researchers may not know the eventual performance of the models until the end of the run, which can take months.
At the same time, OpenAI discovered another route to smarter AI, called "test-time compute," a way to have the AI model spend more time and compute power "thinking" about each question, allowing it to solve challenging tasks such as math or complex operations that demand advanced reasoning and decision-making.
GPT-5 acts as a router, meaning if a user asks GPT-5 a particularly hard problem, it will use test-time compute to answer the question.
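The routing behavior described above can be sketched in a few lines. This is a hypothetical illustration only, not OpenAI's implementation: the model names, the difficulty score, and the threshold are all invented, and in practice the difficulty estimate would come from a learned classifier rather than a hand-set number.

```python
# Hypothetical sketch of prompt routing: easy prompts go to a fast,
# cheap model; hard prompts go to a slower "test-time compute" path
# that spends more inference effort reasoning about the answer.

def route(prompt: str, difficulty_score: float) -> str:
    """Pick a model tier based on an estimated difficulty in [0, 1]."""
    if difficulty_score > 0.7:
        return "reasoning-model"  # slower path: extra compute per question
    return "fast-model"           # low-latency path for routine queries

# Example: a trivial question stays on the fast path,
# while a hard math problem is escalated.
print(route("What is 2 + 2?", 0.05))
print(route("Prove this number-theory conjecture.", 0.92))
```

The key design point is that the user sends every question to one endpoint and the system, not the user, decides when the extra "thinking" is worth the cost.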
This is the first time the general public will have access to OpenAI's test-time compute technology, something that Altman said is important to the company's mission to build AI that benefits all of humanity.
Mr Altman believes the current investment in AI is still inadequate.
"We need to build a lot more infrastructure globally to have AI locally available in all these markets," he said.
Friday, August 8, 2025
AI Can Do a Lot—but Still Seems Totally Stumped by Sudoku
Artificial intelligence chatbots can whip up the code for a website in just a few seconds and summarize the important parts of a 90-minute meeting in moments. But how trustworthy is the technology? High-profile examples of AI hallucinating or gaslighting users have made some people understandably wary. But a group of researchers at the University of Colorado Boulder has come up with an interesting way to test the trustworthiness of the technology: by playing Sudoku.
The researchers gave AI models 2,300 six-by-six Sudokus (which are simpler than the nine-by-nine grids most humans play). They then set the AI loose, asking five different models to solve them all, and then asked the models to explain their answers.
The AI struggled a bit with the puzzles themselves. ChatGPT’s o1 model, for instance, only solved 65 percent of the puzzles correctly; that’s an older model that was state of the art two years ago (the company introduced o4-mini in April). Other AI systems did even worse.
Nobody’s perfect, not even a machine, but things got really interesting when the researchers asked the AI platforms to explain how they chose their answers.
“Sometimes, the AI explanations made up facts,” said Ashutosh Trivedi, a co-author of the study and associate professor of computer science at CU Boulder, in a statement. “So it might say, ‘There cannot be a two here because there’s already a two in the same row,’ but that wasn’t the case.”
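A justification like the one Trivedi describes is easy to check mechanically, which is what makes fabricated explanations detectable. The sketch below (not code from the study; the grid and function names are invented for illustration) verifies whether a claimed "there's already a two in the same row" reason actually holds on a six-by-six grid.

```python
# Minimal sketch of fact-checking a model's Sudoku justification
# against the actual grid. 0 marks an empty cell.

def row_contains(grid, row, value):
    """True if `value` already appears in the given row."""
    return value in grid[row]

def col_contains(grid, col, value):
    """True if `value` already appears in the given column."""
    return any(grid[r][col] == value for r in range(len(grid)))

grid = [
    [1, 0, 3, 0, 5, 6],
    [4, 5, 6, 1, 2, 3],
    [2, 3, 0, 5, 6, 4],
    [5, 6, 4, 2, 3, 1],
    [3, 1, 2, 6, 4, 5],
    [6, 4, 5, 3, 1, 2],
]

# Suppose the model claims: "a 2 cannot go at row 0, column 1,
# because there's already a 2 in that row." Row 0 holds no 2,
# so the stated reason does not match the board.
print(row_contains(grid, 0, 2))
```

Comparing each stated constraint against the board in this way is how one can separate a genuine explanation from a confabulated one.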
One of the AIs, when asked about Sudoku, answered the question by giving an unprompted weather forecast. “At that point, the AI had gone berserk and was completely confused,” said study co-author Fabio Somenzi, professor in the Department of Electrical, Computer, and Energy Engineering.
The hallucinations and glitches, the authors note, “underscore significant challenges that must be addressed before LLMs can become effective partners in human-AI collaborative decision-making.”
The o1 model from OpenAI was especially bad at explaining its actions, despite vastly outpacing the other AI models with the puzzles. (The others, the study says, were “not currently capable” of solving six-by-six Sudoku puzzles.) Researchers said its answers failed to justify moves, used the wrong basic terminology, and poorly articulated the path it had taken to solve the puzzle.
On a broader scale, the public’s trust in AI has a long way to go. A study by KPMG found that just 41 percent of people are willing to trust AI, even when they’re eager to see its benefits. The World Economic Forum, meanwhile, says trust will shape outcomes in the AI-powered economy, while McKinsey, in March of this year, reported 78 percent of organizations use AI in at least one business function.
The Sudoku study was less about whether artificial intelligence could solve the puzzle and more a logic exercise. The focus was to gain insight into how AI systems think. A better understanding of how AI thinks could ultimately improve people’s trust levels and ensure that the results the AI spits out, whether it’s computer code or something to do with your finances, are more reliable.
“Puzzles are fun, but they’re also a microcosm for studying the decision-making process in machine learning,” said Somenzi. “If you have AI prepare your taxes, you want to be able to explain to the IRS why the AI wrote what it wrote.”
BY CHRIS MORRIS @MORRISATLARGE
Wednesday, August 6, 2025
A Harvard Professor Says This Is How AI Will Shake Up White-Collar Work
Last week, a report from Microsoft — one of the companies most aggressively pushing AI tools out into the world — identified the 40 jobs AI is most likely to take over in the coming years, as well as the 40 jobs most resistant to the AI invasion. You probably won’t be surprised that telemarketers and translators are at high risk, while more hands-on roles like nursing assistants and embalmers are at low risk. But in a new report, Christopher Stanton, an associate professor of business administration at Harvard Business School, explained how quickly AI might upset many more white-collar jobs than some people think, and he also worries that there may not be much we can do to stop it.
Stanton’s research covers the impact of AI in the workplace, so he knows what he’s talking about, and some of the statistics and opinions he voiced should concern pretty much every leader of any size company. When you look at “the tasks workers in white-collar work can do and what we think AI is capable of,” he explained to The Harvard Gazette, the “overlap impacts about 35 percent of the tasks that we see in labor market data.” Essentially, Stanton thinks that the suite of AI tools that’s already accessible to businesses could replace a human worker in about one in every three tasks typical in the office. Whether companies actually are choosing to do that is an open question, however.
Stanton also set out an optimistic case for AI replacing human workers, using his own job as professor as a model. Optimistically, he thinks companies may choose to use AI to automate some jobs and thus “free up people to concentrate on different aspects of a job.” As a professor, you might see “20 percent or 30 percent of the tasks that a professor could do being done by AI, but the other 80 percent or 70 percent are things that might be complementary to what an AI might produce,” he said. Here his words echo numerous other AI proponents’ promises about AI.
But when it comes to keeping AI evolution on track — and the expansion of AI has been “probably some of the fastest-diffusing technology around,” Stanton said — this expert has a darker idea. While he admits the jury is still out on whether AI will displace people from whole classes of jobs or not, he does worry that it might upset the entire job market, with many middle-class Americans suddenly out of work, leading to impacts on society. Stanton said he felt politicians “will have a very limited ability to do anything here unless it’s through subsidies or tax policy,” because “anything that you would do to prop up employment, you’ll see a competitor who is more nimble and with a lower cost who doesn’t have that same legacy labor stack probably outcompete people dynamically.”
Stanton’s words resonate strongly with the ongoing mainstream debate about the impact of AI, and in particular with actions by Amazon’s CEO Andy Jassy. In a memo to staff recently, Jassy gave a leadership master class about how not to talk about AI, bungling the news that AI would indeed be taking people’s jobs at the retail and internet giant so badly that it triggered emotional staff pushback on Amazon’s internal Slack discussion system, with some workers demanding senior leadership positions should also be under the same AI threat.
But last week Jassy took a different tone when he addressed the matter during Amazon’s earnings call, Fortune reported. After reiterating that AI is going to “change very substantially the way we work,” he softened his stance and instead suggested that AI will “make all our teammates’ jobs more enjoyable” since it’ll free them from many “rote” procedures that couldn’t previously be automated.
Saying AI will make jobs more “enjoyable” is an interesting turn of phrase, and it does echo recent research by global tech giant HP. The company’s study found seven in 10 workers who use AI say it can make their jobs easier, which may correlate with lower stress and also boosted happiness (which translates to better productivity).
But there’s a strong undercurrent to all this research. Stanton, Jassy, and other experts argue that AI will take people’s jobs away … but for the remaining staff, it may make their days smoother.
Why should you care about this? If you’re busy planning how your company will leverage AI, the way you explain the initiative to your workers matters. Honest words about easing their daily tasks, along with a promise not to overburden them with more work now that they have AI assistance, are probably a good idea.
BY KIT EATON @KITEATON