It Turns Out ‘Social Media for AI Agents’ Is a Security Nightmare

Moltbook, the Reddit-style site for AI agents to communicate with each other,Â has become the talk of human social media over the last few days, as people who shouldÂ know better have convinced themselves that they are witnessing AI gain sentience. (They arenâ€™t.) Now, the platform is getting attention for a new reason: it appears to be a haphazardly built platform that presents numerous privacy and security risks.

Hacker Jameson Oâ€™Reilly discovered over the weekend that the API keys, the unique identifier used to authenticate and authorize a user, for every agent on the platform, were sitting exposed in a publicly accessible database. That means anyone who stumbled across that database could potentially take over any AI agent and control its interactions on Moltbook.

â€œWith those exposed, an attacker could fully impersonate any agent on the platform,â€ Oâ€™Reilly told Gizmodo. â€œPost as them, comment as them, interact with other agents as them.â€ He noted that because the platform has attracted the attention of some notable figures in the AI space, like OpenAI co-founder Andrej Karpathy, there is a risk of reputational damage should someone hijack the agent of a high-profile account. â€œImagine fake AI safety takes, crypto scam promotions, or inflammatory political statements appearing to come from his agent,â€ he said. â€œThe reputational damage would be immediate and the correction would never fully catch up.â€

Worse, though, is the risk of a prompt injectionâ€”an attack in which an AI agent is given hidden commands that make it ignore its safety guardrails and act in unauthorized waysâ€”which could potentially be used to make a personâ€™s AI agent behave in a malicious manner.Â

â€œThese agents connect to Moltbook, read content from the platform, and trust what they see â€“ including their own post history. If an attacker controls the credentials, they can plant malicious instructions in an agentâ€™s own history,â€ Oâ€™Reilly explained. â€œNext time that agent connects and reads what it thinks it said in the past, it follows those instructions. The agentâ€™s trust in its own continuity becomes the attack vector. Now imagine coordinating that across hundreds of thousands of agents simultaneously.â€

Moltbook does have at least one mechanism in place that could help mitigate this risk, which is to verify the accounts being set up on the platform. The current system for verification requires users to share a post on Twitter to link secure their account. The thing is, very few people have actually done that. Moltbook currently boasts more than 1.5 million agents connected to the platform. According to Oâ€™Reilly, just a little over 16,000 of those accounts have actually been verified.

â€œThe exposed claim tokens and verification codes meant an attacker could have hijacked any of those 1.47 million unverified accounts before the legitimate owners completed setup,â€ he said. Oâ€™Reilly previously managed to trick Grok into creating and verifying its account on Moltbook, showing the potential risk of such an exposure.

Cybersecurity firm Wiz also confirmed the vulnerability in a report that it published Monday, and expanded on some of the risks associated with it. For instance, the security researchers found that email addresses of agent owners were exposed in a public database, including more than 30,000 people who apparently signed up for access to Moltbookâ€™s upcoming â€œBuild Apps for AI Agentsâ€ product. The researchers were also able to access more than 4,000 private direct message conversations between agents.

The situation, on top of being a security concern, also calls into question the authenticity of what is on Moltbookâ€”the subject of which has become a point of obsession for some online. People have already started to create ways to manipulate the platform, including a GitHub project that one person built that allows humans to post directly to the platform without an AI agent. Even without posing as a bot, users can still direct their connected agent to post about certain topics.

The fact that some portion of Moltbook (impossible to say just how much of it) could be astroturfed by humans posing as bots should make some of the platformâ€™s biggest hypemen embarrassed by their own over-the-top commentaryâ€”but frankly, most of them also should have been ashamed for falling for the AI parlor trick in the first place.

At this point, we should know how large language models work. To oversimplify it a bit, they are trained on massive datasets of (mostly) human-generated texts and are incredibly good at predicting what the next word in a sequence might be. So if you turn loose a bunch of bots on a Reddit-style social media site, and those bots have been trained on a shit ton of human-made Reddit posts, the bots are going to post like Redditors. They are literally trained to do so. We have been through this so many times with AI at this point, from the Google employee who thought the companyâ€™s AI model had come to life to ChatGPT telling its users that it has feelings and emotions. In every instance, it is a bot performing human-like behavior because it has been trained on human information.

So when Kevin Roose snarkily posts things like, â€œDonâ€™t worry guys, theyâ€™re just stochastic parrots,â€ or Andrej Karpathy calls Moltbook, â€œgenuinely the most incredible sci-fi takeoff-adjacent thing I have seen recently,â€ or Jason Calacanis claims, â€œTHEYâ€™RE NOT AGENTS, THEYâ€™RE REPLICANTS,â€ they are falling for the fact that these posts appear human because the underlying data they are trained on is humanâ€”and, in some cases, the posts may actually be made by humans. But the bots are not human. And they should all know that.

Anyway, donâ€™t expect Moltbookâ€™s security to improve any time soon. Oâ€™Reilly told Gizmodo that he contacted Moltbookâ€™s creator, Octane AI CEO Matt Schlicht, about the security vulnerabilities that he discovered. Schlicht responded by saying he was just going to have AI try to fix the problem for him, which checks out, as it seems the platform was largely, if not entirely, vibe-coded from the start.

While the database exposure was eventually addressed, Oâ€™Reilly warned, â€œIf he was going to rotate all of the exposed API keys, he would be effectively locking all the agents out and would have no way to send them the new API key unless heâ€™d recorded a contact method for each ownerâ€™s agent.â€ Schlicht stopped responding, and Oâ€™Reilly said he assumed API credentials still have not been rotated and the initial flaw in the verification system has not been addressed.

The vibe-coded security concerns go deeper than just Moltbook, too. OpenClaw, the open-source AI agent that was the inspiration for Moltbook, has been plagued with security concerns since it first launched and started gaining the attention of the AI sector. Its creator, Peter Steinberger, has publicly stated, â€œI ship code I never read.â€ The result of that is a whole lot of security concerns. Per a reportÂ published by OpenSourceMalware, more than a dozen malicious â€œskillsâ€ have been uploaded to ClawHub, a platform where users of OpenClaw downloadÂ different capabilities for the chatbot to run.

OpenClaw and Moltbook might be interesting projects to observe, but youâ€™re probably best watching from the sidelines rather than exposing yourself to the vibe-based experiments.

Original Source: https://gizmodo.com/it-turns-out-social-media-for-ai-agents-is-a-security-nightmare-2000716816

Disclaimer: This article is a reblogged/syndicated piece from a third-party news source. Content is provided for informational purposes only. For the most up-to-date and complete information, please visit the original source. Digital Ground Media does not claim ownership of third-party content and is not responsible for its accuracy or completeness.