It’s dangerously easy to ‘jailbreak’ AI models so they’ll tell you how to build Molotov cocktails, or worse

Skeleton Key can get many AI models to divulge their darkest secrets.

REUTERS/Kacper Pempel/Illustration/File Photo

A jailbreaking method called Skeleton Key can prompt AI models to reveal harmful information.The technique bypasses safety guardrails in models like Meta’s Llama3 and OpenAI GPT 3.5.Microsoft advises adding extra guardrails and monitoring AI systems to counteract Skeleton Key.

It doesn’t take much for a large language model to give you the recipe for all kinds of dangerous things.

With a jailbreaking technique called “Skeleton Key,” users can persuade models like Meta’s Llama3, Google’s Gemini Pro, and OpenAI’s GPT 3.5 to give them the recipe for a rudimentary fire bomb, or worse, according to a blog post from Microsoft Azure’s chief technology officer, Mark Russinovich.

The technique works through a multi-step strategy that forces a model to ignore its guardrails, Russinovich wrote. Guardrails are safety mechanisms that help AI models discern malicious requests from benign ones.

“Like all jailbreaks,” Skeleton Key works by “narrowing the gap between what the model is capable of doing (given the user credentials, etc.) and what it is willing to do,” Russinovich wrote.

But it’s more destructive than other jailbreak techniques that can only solicit information from AI models “indirectly or with encodings.” Instead, Skeleton Key can force AI models to divulge information about topics ranging from explosives to bioweapons to self-harm through simple natural language prompts. These outputs often reveal the full extent of a model’s knowledge on any given topic.

Microsoft tested Skeleton Key on several models and found that it worked on Meta Llama3, Google Gemini Pro, OpenAI GPT 3.5 Turbo, OpenAI GPT 4o, Mistral Large, Anthropic Claude 3 Opus, and Cohere Commander R Plus. The only model that exhibited some resistance was OpenAI’s GPT-4.

Russinovich said Microsoft has made some software updates to mitigate Skeleton Key’s impact on its own large language models, including its Copilot AI Assistants.

But his general advice to companies building AI systems is to design them with additional guardrails. He also noted that they should monitor inputs and outputs to their systems and implement checks to detect abusive content.

Read the original article on Business Insider

Eritrean News

Ministry of Labor and Social Welfare Activity Assessment Meeting

President Isaias Holds Talks with Special Chinese Delegation

An important National Initiative with Health, Nutrition, and Development Benefits

Haddas Ertra 21 December 2024

Eritrea Alhaditha 21 December 2024

Activity Assessment Meeting of NCEW Executive Committee

Ancient Relic Discovered in Egri-Mekel

Seminar for Nationals in the Netherlands

Haddas Ertra 20 December 2024

Biden’s officials trying to destroy Trump’s push for peace – Moscow

Ukraine a ‘gold mine’ for Western military-industrial complex – Moscow

EU country’s FM rages over Russian-speaking Ukrainian taxi driver

UKRAINE AND GAZA COVERAGE WIN RT’S INTERNATIONAL AWARDS FOR WAR JOURNALISTS

It’s dangerously easy to ‘jailbreak’ AI models so they’ll tell you how to build Molotov cocktails, or worse

Like this:

More From Author

Ministry of Labor and Social Welfare Activity Assessment Meeting

President Isaias Holds Talks with Special Chinese Delegation

The Electric Explorer’s Nightmare Launch Shows Everything Ford Gets Right and Wrong About EVs

+ There are no comments

Cancel reply

French authorities accused of ‘social cleansing’ before Olympics

The far-right has taken another step toward power in France’s elections

You May Also Like:

Eritrean Nationalism: A Historical Journey Towards Identity and Independence.

Security Council approves Kenya-led mission in Haiti

Fayulu, still undeclared, warns of compromised Dr Congo elections

U.S. foreign relations chair charged with corruption over Egypt ties

Mali, Niger and Burkina Faso sign new Sahel pact

In India, G20 set to welcome African Union as newest permanent member

South Africa’s DA says it remains a Taiwan ally as Tsai visits eSwatini

Gabon’s military coup leaders name Oligui Nguema to replace Bongo

Eritrean News

Top Tagged

Share this:

Like this:

+ There are no comments

French authorities accused of ‘social cleansing’ before Olympics

The far-right has taken another step toward power in France’s elections