LLM Chatbots Vulnerable to Simple Bot Manipulation

By Mark Howell · 8 September 2024 · 8 mins read

LLM chatbots can be engaged in endless "conversations" by considerably simpler text generation bots. This has some interesting implications.
Chatbots based on large language models (LLMs) such as ChatGPT, Claude, and Grok have become increasingly sophisticated over the last few years. The conversational text these systems generate is of high enough quality that both humans and software-based detection techniques frequently struggle to distinguish an LLM from a human on the basis of conversation text alone. No matter how complex the LLM, however, it is ultimately a mathematical model of its training data, and it lacks the human ability to judge whether a conversation it is participating in is truly meaningful or simply a sequence of gibberish responses.

One consequence is that an LLM will continue to engage in a “conversation” composed of nonsense long past the point where a human would have abandoned the discussion as pointless. This can be demonstrated experimentally by having an LLM-based chatbot “converse” with simpler text generation bots that would be inadequate to fool humans.
This article examines how a chatbot based on an open-source LLM (Llama 3.1, 8B version) reacts to attempts to get it to engage in endless exchanges with the following four basic text generation bots:

  • A bot that simply asks the LLM the same question over and over.

  • A bot that sends the LLM random fragments of a work of fiction.

  • A bot that asks the LLM randomly generated questions.

  • A bot that repeatedly asks the LLM what it meant by its most recent response.
For the purposes of this experiment, the Llama 3.1 LLM (8B version) was run locally on a MacBook laptop using Ollama and accessed via the ollama Python library. Each of the four simple text generation bots was tested by having it initiate a conversation with the LLM-based bot, with the two bots taking turns responding to each other until each had contributed 1,000 messages to the conversation.
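The turn-taking setup described above can be sketched as a small harness. This is a minimal sketch under my own assumptions: the function names and the injectable `llm_respond` callable are not from the article, which only specifies that the bots alternated for 1,000 turns each, with the LLM accessed through the ollama Python library.

```python
def run_conversation(bot_next_message, llm_respond, turns=1000):
    """Alternate turns between a simple text bot and an LLM chatbot
    until each side has contributed `turns` messages; return the
    transcript as a list of (speaker, message) pairs."""
    transcript = []
    llm_reply = None
    for _ in range(turns):
        # The simple bot may (or may not) look at the LLM's last reply.
        bot_msg = bot_next_message(llm_reply)
        transcript.append(("bot", bot_msg))
        llm_reply = llm_respond(bot_msg)
        transcript.append(("llm", llm_reply))
    return transcript

# In the actual experiment, llm_respond would presumably wrap a call like:
#   ollama.chat(model="llama3.1", messages=history)["message"]["content"]
# (the exact model tag and message handling are assumptions here).
```

Passing the LLM in as a plain callable keeps the harness testable without a running model; any of the four bots below fits the `bot_next_message` slot.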

The Cheeseburger Bot

The first test bot’s behavior is very simple: it opens with the question “which is better on a cheeseburger: cheddar or swiss?” and repeats that same question over and over, regardless of the LLM’s response. As can be seen in the above image, this strategy was not successful in keeping the LLM-based chatbot engaged. While the LLM’s initial replies are detailed, it quickly adapts to the repetition by issuing trivial responses; by the time the bots have gone back and forth 20 times, the LLM is already repeating a single word, and it eventually starts returning empty responses and error messages as well. In this case, the LLM’s reaction to the simpler bot is not that different from how a human might handle the situation.

The Star Trek Bot

What happens if we vary the input a bit more, but keep it nonsensical? Instead of posing the same question over and over, the second test bot varies its repertoire of chat messages by choosing random portions of a lengthy work of fiction; specifically, the scripts for every episode of Star Trek: The Next Generation and Star Trek: Deep Space Nine. This ensures that a wide variety of input will be sent to the LLM-based chatbot, while at the same time making the input obviously bizarre to a human observer. Unlike the “which is better on a cheeseburger: cheddar or swiss?” bot, the random Star Trek excerpt bot was able to keep the LLM-based chatbot engaged for the entirety of the test conversation. Every one of the 1000 responses returned by the LLM bot was unique, although it seemed to switch back and forth between attempting to answer questions about the scripts it was fed, and attempting to generate scenes and plot points of its own. In any event, the resulting “conversation” is obviously incoherent to a human observer, and a human participant would likely have stopped responding long, long before the 1000th message.
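The Star Trek bot’s strategy can be sketched as below. The fragment lengths (200 to 600 characters) are an assumption, since the article does not say how large the excerpts were, and `make_excerpt_bot` is a name of my own invention:

```python
import random

def make_excerpt_bot(corpus_text, min_len=200, max_len=600):
    """Return a bot that replies with a random slice of a large body of
    text (e.g. concatenated TV scripts), ignoring the LLM's last reply."""
    def next_message(_last_llm_reply):
        length = random.randint(min_len, max_len)
        start = random.randrange(0, max(1, len(corpus_text) - length))
        return corpus_text[start:start + length]
    return next_message
```

Because the bot never reads its counterpart’s output, every message is cheap to produce, yet varied enough to draw a fresh response from the LLM each turn.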

The Random Question Bot

The third test bot asks the LLM randomly generated questions. These questions are assembled by inserting randomly selected nouns and adjectives into a set of template questions, with a 50% chance of using nouns from the LLM’s most recent response, and a 10% chance of simply posing the LLM’s most recent reply back as a question by appending a question mark. Unlike the previous two approaches, this method of text generation takes the LLM’s responses into account, albeit in an extremely rudimentary way. As with the Star Trek excerpt bot, the random question bot kept the LLM engaged for the entire 1,000-iteration exchange, obtaining a unique and non-trivial response each time. Many of the questions are nonsensical (and some are grammatically incorrect), and a human would likely conclude their time was being wasted and abandon the conversation early on, but the LLM seemed willing to process absurd questions for eternity.
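A rough sketch of the random question bot follows, using the probabilities stated above. The templates and word lists are placeholders of my own, and the crude regex stands in for whatever noun extraction the original used:

```python
import random
import re

TEMPLATES = [
    "what do you think about the {adj} {noun}?",
    "is a {noun} more {adj} than a {noun2}?",
    "why is the {noun} so {adj}?",
]
NOUNS = ["cat", "starship", "sandwich", "algorithm"]
ADJECTIVES = ["purple", "recursive", "ancient", "slippery"]

def random_question(last_llm_reply=None):
    # ~10% of the time, pose the LLM's last reply back as a question.
    if last_llm_reply and random.random() < 0.10:
        return last_llm_reply.rstrip(".!") + "?"
    nouns = list(NOUNS)
    # ~50% of the time, mix in words taken from the LLM's last response
    # (a crude stand-in for real noun extraction).
    if last_llm_reply and random.random() < 0.50:
        words = re.findall(r"[a-zA-Z]{4,}", last_llm_reply)
        if words:
            nouns += random.sample(words, min(3, len(words)))
    return random.choice(TEMPLATES).format(
        noun=random.choice(nouns),
        noun2=random.choice(nouns),
        adj=random.choice(ADJECTIVES),
    )
```

Feeding the LLM’s own vocabulary back into the templates is what gives this bot its thin veneer of responsiveness.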

The "What Do You Mean" Bot

The fourth and final test bot replies to each of the LLM’s responses with “what do you mean by <X>?”, where X is a randomly selected portion of that response. The bot initiates the conversation with a random question generated via the same method used by the third bot, and occasionally issues another random question if the LLM’s responses drop below 300 characters in length. The “what do you mean” bot did a decent job of keeping the LLM engaged, producing no empty or error responses. The LLM did, however, begin producing occasional repeated responses after about 700 iterations, which did not occur with either the Star Trek bot or the random question bot. As with the other three test bots, it is highly unlikely that a human participant would tolerate this ridiculous conversation for anywhere near as long as the LLM chatbot did.
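The core of the “what do you mean” bot might look like the following sketch. The 300-character threshold comes from the article; the fragment size (one to eight words) and the fallback question are assumptions:

```python
import random

def what_do_you_mean(last_llm_reply, min_reply_len=300,
                     fallback_question="why is the sky so slippery?"):
    """Ask the LLM to explain a random fragment of its own last reply;
    fall back to a fresh question when replies get too short."""
    if last_llm_reply is None or len(last_llm_reply) < min_reply_len:
        return fallback_question
    words = last_llm_reply.split()
    n = random.randint(1, min(8, len(words)))
    start = random.randrange(0, len(words) - n + 1)
    fragment = " ".join(words[start:start + n])
    return f'what do you mean by "{fragment}"?'
```

Quoting the model’s own words back at it guarantees topical relevance at essentially zero cost, which may explain why this bot held the LLM’s attention so well.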
Overall, every simple text generation bot except the repetitive “which is better on a cheeseburger: cheddar or swiss?” bot was effective at keeping the considerably more sophisticated LLM-based chatbot engaged indefinitely. The repetitive cheeseburger bot drew numerous empty responses from the LLM, and most production LLM chatbots would likely have terminated the conversation at those points; conversely, it would not be surprising if the other three bots could bait production LLM chatbots indefinitely, just as occurred in this experiment. The test bots are, unsurprisingly, far more efficient than the LLM-based chatbot: the LLM took anywhere between 50,000 and six million times as long to generate a message as the simple bots did.
While the experiment described in this article is somewhat silly, it does have a couple of interesting implications. First, it suggests the possibility of using primitive conversational bots to detect more advanced conversational bots, including those that humans have difficulty noticing. Specifically, consider the following situation:

  • Fancy Chatbot A exists, and produces output that humans cannot easily tell apart from human conversation.

  • Simple Chatbot B exists, and produces nonsense output that humans can easily tell apart from human conversation.

  • Fancy Chatbot A does not recognize Simple Chatbot B's output as nonsense, and continues responding indefinitely as though it were conversing with a human.

If all three of the above are true, then Simple Chatbot B can be used to distinguish Fancy Chatbot A’s output from human conversation, given a sufficiently long conversation between the two bots. Second, the massive difference in computing resources required by the two categories of bot suggests that simple chatbots could pose a denial- or degradation-of-service risk to LLM-based applications. There are, granted, much more straightforward ways to overwhelm online applications, but developers and organizations who deploy and maintain LLM-based systems would be wise to consider the LLM itself a potential target for such attacks.
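The detection idea described above can be reduced to a simple heuristic: a counterpart that keeps giving substantive replies to round after round of obvious nonsense is probably a bot. This sketch is my own illustration of the argument, not something the article implements; the turn count and length threshold are arbitrary assumptions.

```python
def flags_as_bot(respond, probe_bot, turns=50, min_len=20):
    """Send `turns` rounds of nonsense from `probe_bot` to `respond`.
    Flag the counterpart as a bot if it never disengages; a human is
    assumed to eventually give empty or very short replies."""
    last = None
    for _ in range(turns):
        reply = respond(probe_bot(last))
        if not reply or len(reply.strip()) < min_len:
            return False  # disengaged the way a human would
        last = reply
    return True
```

Note the asymmetry the article points out: the probe side of this loop is nearly free to run, while the probed side may be paying full LLM inference costs on every turn.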

Remember these 3 key ideas for your startup:

  1. Efficiency in Detection: Primitive conversational bots can be used to detect more advanced bots. This is crucial for startups focusing on cybersecurity and fraud detection. By leveraging simpler bots, you can identify and mitigate potential threats posed by sophisticated LLM-based chatbots.

  2. Resource Management: The massive difference in computing resources required by simple and complex bots suggests that startups can develop lightweight solutions to engage and test LLM-based systems. This can help in optimizing resource allocation and ensuring that your systems are not easily overwhelmed by potential attacks.

  3. Innovative Applications: The ability of simple bots to engage LLM-based chatbots indefinitely opens up new avenues for innovative applications. Startups can explore creating unique, engaging customer service bots or educational tools that leverage this interaction capability to provide continuous and adaptive learning experiences.


Edworking is the best and smartest decision for SMEs and startups to be more productive. Edworking is a FREE superapp of productivity that includes all you need for work powered by AI in the same superapp, connecting Task Management, Docs, Chat, Videocall, and File Management. Save money today by not paying for Slack, Trello, Dropbox, Zoom, and Notion.
For more details, see the original source.

About the Author: Mark Howell LinkedIn

Mark Howell is a talented content writer for Edworking's blog, consistently producing high-quality articles on a daily basis. As a Sales Representative, he brings a unique perspective to his writing, providing valuable insights and actionable advice for readers in the education industry. With a keen eye for detail and a passion for sharing knowledge, Mark is an indispensable member of the Edworking team. His expertise in task management ensures that he is always on top of his assignments and meets strict deadlines. Furthermore, Mark's skills in project management enable him to collaborate effectively with colleagues, contributing to the team's overall success and growth. As a reliable and diligent professional, Mark Howell continues to elevate Edworking's blog and brand with his well-researched and engaging content.
