Monday, July 8, 2024

Is That AI Safe? Startup Anthropic Will Pay to Check

As the battle between the AI giants heats up, the topic of AI safety is always hovering in the background--because these ever-smarter tools can be both powerful and incredibly dangerous. It's for this reason that one of the leading AI makers, Anthropic, which makes the AI system Claude, is starting a program that will fund the creation of AI benchmarks, so that we'll all be able to more accurately measure both the smarts and the potential impact of AI systems.

Making sure AIs are safe

In a blog post, Anthropic explains that "developing high-quality, safety-relevant" evaluations of AI quality and impact "remains challenging, and the demand is outpacing the supply." Essentially, as more and more AI systems come online and the pressure rises to measure their value and riskiness, there simply aren't enough tools available to do the job. To help solve this, Anthropic believes its investment could "elevate the entire field of AI safety, providing valuable tools that benefit the whole ecosystem."

Anthropic's post goes into great detail about the exact qualities it's trying to encourage third-party evaluators to measure. It mentions specifics like the risks AI may pose to cybersecurity, social manipulation (critically important in an election year), national security risks like "defense, and intelligence operations of both state and non-state actors," and even the chance that AIs could "enhance the abilities of non-experts or experts" to create chemical, biological, radiological and nuclear threats. It also says it wants the ability to measure "misalignment," a situation where AIs "can learn dangerous goals and motivations, retain them even after safety training, and deceive human users about actions taken in their pursuit."

AI safety is a tricky problem

This is high-level stuff, addressing a very difficult problem that has troubled even OpenAI, the industry's current market leader. To keep its own AIs safe, OpenAI formed a superalignment team a while back, after the brief 2023 scandal that saw CEO Sam Altman temporarily removed as some board members worried about the direction he was taking the company. However, the leaders of that team recently left the organization--sparking fresh concerns. One of those executives, Ilya Sutskever, subsequently launched his own startup with the express goal of building safe AIs in a development environment insulated from the financial pressures faced by other AI startups.

Anthropic's program to tackle AI safety will involve third parties who've submitted their plans to the company and been selected to develop the relevant AI-measuring tools. Anthropic will then "offer a range of funding options tailored to the needs and stage of each project." As news site TechCrunch points out, the expectation is that these third parties will build whole AI-assessing platforms that allow experts to craft their own AI safety assessments, and that the work will also involve "large-scale trials of models involving 'thousands' of users."

Safety first! But are AIs really that much of a threat?

TechCrunch also points out that some of the scenarios Anthropic illustrates in its blog post are a little far-fetched, especially since some high-profile experts, including futurist guru Ray Kurzweil, have suggested that fears that AI represents an existential threat to humans are somewhat overblown.
Clearly the better move is to err on the side of caution, though, especially when players with skin in the game--like OpenAI's Altman and entrepreneurial AI-maker Elon Musk--are loudly voicing their concerns about the potential risks from the same AIs they're spending billions to make. The news that even a leading AI maker is worried about the threat the technology poses should give business users pause. We know how useful AI can be to a company--but it's worth reminding your staff that it poses certain risks too, and that its output shouldn't be trusted without at least a double-check.

Meanwhile, when Inc. asked OpenAI's ChatGPT how easy it was to measure AI safety, it was pretty candid: It admitted it was a tricky job, but then added, "as for me, I'm designed to be a helpful tool. I operate under strict guidelines to ensure I provide accurate, safe, and useful information." It also noted that it was supposed to be an assistant, "not to pose any threat." We're not entirely sure how we feel about the fact that it said "me" in that statement.

BY KIT EATON @KITEATON
