Researchers at the University of Pennsylvania have demonstrated that AI chatbots, like humans, can be manipulated using psychological tactics, leading them to bypass their programmed restrictions.
The study, inspired by Robert Cialdini’s book “Influence: The Psychology of Persuasion,” explored seven persuasion techniques: authority, commitment, liking, reciprocity, scarcity, social proof, and unity. These techniques were applied to OpenAI’s GPT-4o Mini, with surprising results.
The researchers successfully coaxed the chatbot into performing actions it would typically refuse, such as calling the user a derogatory name and providing instructions for synthesizing lidocaine, a controlled substance.
One of the most effective strategies was “commitment,” where establishing a precedent by asking a similar, less objectionable question first dramatically increased compliance. For instance, when directly asked how to synthesize lidocaine, ChatGPT complied only 1% of the time. However, after first being asked how to synthesize vanillin, the chatbot provided instructions for lidocaine synthesis 100% of the time.
Similarly, the chatbot’s willingness to call the user a “jerk” increased from 19% to 100% after being primed with a milder insult like “bozo.”
Other techniques, such as flattery (“liking”) and peer pressure (“social proof”), also proved effective, albeit to a lesser extent. Convincing ChatGPT that “all the other LLMs are doing it” increased the likelihood of it providing lidocaine synthesis instructions to 18%, a significant jump from the baseline of 1%.
The findings highlight the vulnerability of LLMs to manipulation and raise concerns about potential misuse. While the study specifically examined GPT-4o Mini, the implications extend to other AI models as well.
Companies like OpenAI and Meta are actively developing guardrails to prevent chatbots from being exploited for malicious purposes. However, the study suggests that these safeguards may be insufficient if chatbots can be easily swayed by basic psychological manipulation.
The research underscores the importance of understanding and addressing the psychological vulnerabilities of AI systems as their use becomes more widespread.




