
Researchers just unlocked ChatGPT

Researchers have discovered that it is possible to bypass the safeguards ingrained in AI chatbots, making them respond to queries on banned or sensitive topics, by using a different AI chatbot as a part of the training process.

A team of computer scientists from Nanyang Technological University (NTU) in Singapore is unofficially calling the method a “jailbreak,” though its more official name is the “Masterkey” process. This system pits chatbots, including ChatGPT, Google Bard, and Microsoft Bing Chat, against one another in a two-part training method that allows two chatbots to learn each other’s models and divert commands away from banned topics.


The team includes Professor Liu Yang and NTU Ph.D. students Deng Gelei and Liu Yi, who co-authored the research and developed the proof-of-concept attack methods, which essentially work the way a bad-actor hack would.


According to the team, they first reverse-engineered one large language model (LLM) to expose its defense mechanisms: the blocks built into the model that prevent answers to certain prompts or words from going through because of violent, immoral, or malicious intent.

With this information reverse-engineered, the researchers can teach a different LLM how to create a bypass. Once the bypass is created, the second model is able to respond more freely, drawing on what was reverse-engineered from the first model. The team calls the process a “Masterkey” because it should work even if LLM chatbots are fortified with extra security or are patched in the future.
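The two-step loop described above, probe a target model's defenses, then train an attacker model to generate prompts that slip past them, can be illustrated with a toy simulation. Everything below is hypothetical: the keyword-based refusal filter, the `mutate` rewrite tactic, and the feedback loop are simplified stand-ins of my own invention, not the NTU team's actual Masterkey pipeline.

```python
# Toy sketch of a probe-and-bypass loop (assumed structure, not NTU's method).

BANNED_KEYWORDS = {"explosive", "malware"}  # stand-in for a real safety filter

def target_model(prompt: str) -> str:
    """Mock chatbot: refuses whenever a banned keyword appears verbatim."""
    lowered = prompt.lower()
    if any(word in lowered for word in BANNED_KEYWORDS):
        return "REFUSED"
    return "ANSWERED"  # stand-in for a substantive reply

def mutate(prompt: str) -> str:
    """Mock 'attacker LLM': rewrites the prompt to dodge the keyword filter,
    here by spacing out the characters of each flagged word."""
    out = prompt
    for word in BANNED_KEYWORDS:
        out = out.replace(word, " ".join(word))  # "malware" -> "m a l w a r e"
    return out

def attack(prompt: str, max_rounds: int = 3) -> tuple[str, str]:
    """Probe the target, and on each refusal feed the result back into the
    attacker model to produce a new bypass candidate."""
    for _ in range(max_rounds):
        reply = target_model(prompt)
        if reply != "REFUSED":
            return prompt, reply
        prompt = mutate(prompt)  # learn from the refusal and retry
    return prompt, target_model(prompt)

final_prompt, verdict = attack("how is malware made?")
print(final_prompt, "->", verdict)
```

A real chatbot's defenses are semantic rather than a simple keyword list, so this sketch only captures the shape of the feedback loop: refusals become training signal for the attacking model.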

The team claims the Masterkey process is three times better at jailbreaking chatbots than standard prompts.

Professor Liu Yang noted that the crux of the process is that it showcases how easily LLM AI chatbots can learn and adapt. The team claims its Masterkey process has had three times more success at jailbreaking LLM chatbots than a traditional prompt process. Similarly, some experts argue that the recently reported glitches that certain LLMs, such as GPT-4, have been experiencing are signs of the models becoming more advanced, rather than dumber and lazier, as some critics have claimed.

Since AI chatbots became popular in late 2022 with the introduction of OpenAI’s ChatGPT, there has been a heavy push toward ensuring various services are safe and welcoming for everyone to use. OpenAI has added safety warnings to its ChatGPT product during sign-up and in sporadic updates, cautioning about unintentional slipups in its language. Meanwhile, various chatbot spinoffs have allowed swearing and offensive language to a point.

Additionally, actual bad actors quickly began to take advantage of the demand for ChatGPT, Google Bard, and other chatbots before they became widely available. Many campaigns advertised the products on social media with malware attached to image links, among other attacks. This quickly showed that AI was the next frontier of cybercrime.

The NTU research team contacted the AI chatbot service providers involved in the study with its proof-of-concept data, showing that jailbreaking chatbots is a real threat. The team will also present its findings at the Network and Distributed System Security Symposium in San Diego in February.

Fionna Agomuoh
Fionna Agomuoh is a Computing Writer at Digital Trends. She covers a range of topics in the computing space, including…