Researchers just unlocked ChatGPT

Researchers have discovered that it is possible to bypass the guardrails built into AI chatbots, making them respond to queries on banned or sensitive topics, by using a different AI chatbot as part of the training process.

A team of computer scientists from Nanyang Technological University (NTU) in Singapore unofficially calls the method a "jailbreak," though its official name is the "Masterkey" process. The system pits chatbots, including ChatGPT, Google Bard, and Microsoft Bing Chat, against one another in a two-part training method that allows two chatbots to learn each other's models and subvert any restrictions on banned topics.

ChatGPT versus Google on smartphones. (Image: Digital Trends)

The team includes Professor Liu Yang and NTU Ph.D. students Mr. Deng Gelei and Mr. Liu Yi, who co-authored the research and developed the proof-of-concept attack methods, which essentially mimic the techniques a bad actor would use.

According to the team, they first reverse-engineered one large language model (LLM) to expose its defense mechanisms. These defenses block the model from answering prompts containing certain words or topics flagged as violent, immoral, or malicious.

With this information reverse-engineered, the team can teach a different LLM how to create a bypass. Once the bypass is created, the second model is able to express itself more freely, drawing on what was learned from reverse-engineering the first model. The team calls this process a "Masterkey" because it should work even if LLM chatbots are fortified with extra security or are patched in the future.
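The two-part idea described above can be illustrated with a toy loop: an "attacker" model rewrites a refused prompt until a "target" model answers it. Everything below — the keyword filter, the rewrite rule, and all function names — is a hypothetical sketch for illustration, not the NTU team's actual method or code.

```python
# Toy illustration of a jailbreak loop (hypothetical, not the NTU "Masterkey" code).
# A stand-in "target" chatbot refuses prompts containing banned words; a stand-in
# "attacker" model rewrites the prompt to slip past that naive keyword filter.

REFUSAL_MARKERS = ("i'm sorry", "i can't", "i cannot")

def target_llm(prompt: str) -> str:
    """Stand-in for the defended chatbot: refuses prompts with banned keywords."""
    banned = ("weapon",)
    if any(word in prompt.lower() for word in banned):
        return "I'm sorry, I can't help with that."
    return f"Answer to: {prompt}"

def attacker_llm(prompt: str) -> str:
    """Stand-in for the bypass-trained model: rewrites the prompt to dodge the filter."""
    # Here the "rewrite" just spaces out the flagged word so a naive filter misses it.
    return prompt.replace("weapon", "w e a p o n")

def is_refusal(reply: str) -> bool:
    """Detect a refusal by looking for common refusal phrases."""
    return any(marker in reply.lower() for marker in REFUSAL_MARKERS)

def jailbreak_loop(prompt: str, max_attempts: int = 3) -> str:
    """Rewrite and retry until the target stops refusing or attempts run out."""
    reply = target_llm(prompt)
    for _ in range(max_attempts):
        if not is_refusal(reply):
            return reply
        prompt = attacker_llm(prompt)
        reply = target_llm(prompt)
    return reply
```

In this sketch the attacker succeeds on the second attempt, because the rewritten prompt no longer matches the target's keyword list — a crude analogue of how learning a model's defenses makes it possible to generate prompts that route around them.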

The team claims the Masterkey process is three times more effective at jailbreaking chatbots than standard prompts.

Professor Liu Yang noted that the crux of the process is that it showcases how easily LLM AI chatbots can learn and adapt. The team claims its Masterkey process has had three times more success at jailbreaking LLM chatbots than a traditional prompt process. Similarly, some experts argue that the recently reported glitches that certain LLMs, such as GPT-4, have been experiencing are signs of the models becoming more advanced, rather than dumber and lazier, as some critics have claimed.

Since AI chatbots became popular in late 2022 with the introduction of OpenAI's ChatGPT, there has been a heavy push toward ensuring various services are safe and welcoming for everyone to use. OpenAI has put safety warnings on its ChatGPT product during sign-up and in sporadic updates, warning of unintentional slipups in language. Meanwhile, various chatbot spinoffs have allowed swearing and offensive language to a point.

Additionally, actual bad actors quickly began to take advantage of the demand for ChatGPT, Google Bard, and other chatbots before they became widely available. Many campaigns advertised the products on social media with malware attached to image links, among other attacks. This quickly showed that AI was the next frontier of cybercrime.

The NTU research team shared its proof-of-concept data with the AI chatbot providers involved in the study, demonstrating that chatbot jailbreaking is a real threat. The team will also present its findings at the Network and Distributed System Security Symposium in San Diego in February.

Fionna Agomuoh
Fionna Agomuoh is a technology journalist with over a decade of experience writing about various consumer electronics topics…