Hackers say AI models are still alarmingly easy to hack, even as the spotlight shifts to AI security risks.


Hello and welcome to Eye on AI! In today’s edition…Elon Musk’s xAI releases its Grok 3 AI model, OpenAI’s CEO teases a potential open-source AI project, South Korea suspends downloads of the DeepSeek chatbot, and Perplexity offers its own Deep Research tool to rival OpenAI’s.

One of the biggest AI vibe shifts of 2025 so far is the sudden, sweeping pivot from AI “safety” to AI “security.”

Since the release of ChatGPT in November 2022, AI safety advocates, who tend to focus on broad, long-term and often philosophical risks, have held the spotlight. Worries that humans could lose control of an AI intent on harming people, or that rogue nations could use AI to create deadly bioengineered pandemics, made regular headlines. A March 2023 open letter calling on all AI labs to “immediately pause for at least six months the training of AI systems more powerful than GPT-4” drew more than 30,000 signatures, including Elon Musk’s. The Biden administration established the U.S. AI Safety Institute within the National Institute of Standards and Technology (NIST), while the United Kingdom set up its own AI Safety Institute and hosted the first of three high-profile international AI safety summits.


Oh, how times have changed: The head of the U.S. AI Safety Institute, Elizabeth Kelly, has departed, a move many saw as a sign that the Trump administration was changing course on AI policy. The next AI safety summit, held in Paris earlier this month, was rebranded as the AI Action Summit. There, U.S. Vice President JD Vance focused squarely on AI and national security, declaring that “we will protect American AI and chip systems from theft and misuse,” while French authorities announced a national institute to “assess and secure AI.”

AI security risks are important

It may seem more urgent and compelling to address the potential for all-powerful AI to go off the rails than to focus on protecting AI models from those who want to break into them. But the world’s best ethical hackers, those who probe systems to find and fix weaknesses before malicious hackers can exploit them, say AI security, like traditional cybersecurity, is far from easy.

AI security risks are not new: A user could trick an LLM into giving specific instructions for carrying out cyberattacks or other dangerous activities. An AI model could be manipulated into revealing sensitive or confidential information from its training data. Meanwhile, self-driving cars could be subtly tampered with, deepfake videos could spread misinformation, and chatbots could impersonate real people as part of scams.

Hackers from the Def Con security conference, the largest annual gathering of ethical hackers, warn that more than two years after OpenAI’s ChatGPT burst onto the scene, it is still far too easy to break into AI systems and tools, and that this won’t change without a fundamental shift in current security practices. They made the case in a recent report, The Hackers’ Almanack, produced in collaboration with the University of Chicago.

Hackers say ‘red teaming’ is ‘BS’

At the moment, most companies focus on “red teaming” their AI models, stress-testing them by simulating attacks, probing for vulnerabilities, and identifying weaknesses. The goal is to uncover security issues like the potential for jailbreaks, misinformation and hallucinations, privacy leaks, and “prompt injection,” in which malicious users trick the model into disobeying its own rules.


But in the Hackers’ Almanack, Sven Cattell, founder of Def Con’s AI Village and of the AI security startup nbhd.ai, called red teaming “BS.” The processes used to track down the flaws and vulnerabilities of AI models are themselves flawed, he argued: with a technology as powerful as LLMs, there will always be “unknown unknowns” that stress-testing and evaluations miss.

Even the biggest organizations can’t anticipate and safeguard against every possible use case and limitation, he said. “For a small team at Microsoft, Stanford, NIST or the EU, there will always be a use or edge case that they didn’t think of,” he wrote.

AI security requires cooperation and collaboration

The only way for AI security to succeed, he emphasized, is for security organizations to cooperate and collaborate, including by creating versions of time-tested cybersecurity programs that let companies and developers disclose, share, and fix AI “bugs,” or vulnerabilities. There is currently no way to report vulnerabilities relating to the unanticipated behavior of an AI model, and no public database of LLM vulnerabilities like those that have existed for other types of software for decades, as was noted after the Def Con conference last August.

“If we want to have a model that we can confidently say doesn’t output toxic content, or helps with programming tasks in Javascript but also does not help produce malicious payloads for bad actors, we need to work together,” Cattell wrote.


And with that, here’s more AI news.

Sharon Goldman
sharon.goldman@fortune.com

This story was originally featured on Fortune.com
