Ollama bug allows drive-by attacks

A now-patched flaw in popular AI model runner Ollama allowed drive-by attacks in which a miscreant uses a malicious website to remotely target people’s personal computers, spy on their local chats, and even control the models the victim’s app talks to, in extreme cases serving poisoned models.
GitLab’s Security Operations senior manager Chris Moberly found and reported the flaw in Ollama Desktop v0.10.0 to the project’s maintainers on July 31. According to Moberly, the team fixed the issue within hours and released the patched software in v0.10.1 — so make sure you’ve applied the update because Moberly on Tuesday published a technical writeup about the attack along with proof-of-concept exploit code.
“Exploiting this in the wild would be trivial,” Moberly told The Register. “There is a little bit of work to build the proper attack infrastructure and to get the interception service working, but it’s something an LLM could write pretty easily.”
The good news: the vulnerable component is a new web service powering the GUI app — not the core Ollama API. Since the new GUI isn’t as well known, and was only available for a few weeks before Moberly found and reported the bug, it’s likely that miscreants didn’t have sufficient time to poke holes in it and exploit the flaw on their own.
“There’s no evidence it was exploited in the wild,” Moberly said. “Hopefully everyone is able to patch before that happens. Those who installed via the official application installers receive auto-updates and just need to restart the app to apply it. Those who installed via Homebrew may need to update manually.”
Ollama is an open-source project for running LLMs on your local machine, and the security issue exists in both the Mac and Windows desktop GUIs, but not the core Ollama API. The bug hasn’t yet been assigned a CVE identifier.
The flaw is due to incomplete cross-origin controls in the local web service bundled with the GUI.
CORS Light
Cross-origin controls are a mechanism that governs how a webpage can request resources from other origins, and the primary way browsers manage this is through a security feature called Cross-Origin Resource Sharing (CORS).
By default, browsers enforce the “same-origin policy”: a script running on a page from one “origin” (the combination of scheme, host, and port) can’t read responses from a different origin, and CORS provides a way for servers to explicitly allow or deny these cross-origin requests. The policy is meant to stop malicious scripts on one website from accessing sensitive data elsewhere, such as services on someone’s local machine, and that boundary is precisely what drive-by attacks try to cross using cross-origin requests.
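The same-origin comparison itself is mechanical: two URLs share an origin only when scheme, host, and port all match. Browsers do this natively; the Python sketch below is purely illustrative (the function name and default-port handling are ours, not anything from Ollama or the PoC):

```python
from urllib.parse import urlsplit

def same_origin(url_a: str, url_b: str) -> bool:
    """Return True when two URLs share scheme, host, and port."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    # urlsplit reports port as None when it's omitted, so fall back
    # to the scheme's default port before comparing.
    defaults = {"http": 80, "https": 443}
    port_a = a.port if a.port is not None else defaults.get(a.scheme)
    port_b = b.port if b.port is not None else defaults.get(b.scheme)
    return (a.scheme, a.hostname, port_a) == (b.scheme, b.hostname, port_b)
```

By this rule, a page on https://evil.example and a service on http://127.0.0.1:54321 are different origins, so the browser would normally restrict what the page’s scripts can learn from that local service.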
To prevent this type of malicious activity, the browser performs a CORS “preflight” check to determine whether a cross-origin request is safe to send before making the actual request. It’s supposed to do this for most “non-simple” requests: those using HTTP methods other than GET, POST, or HEAD, or carrying custom headers. The preflight lets the server decide whether the request is permitted.
Some “simple” requests that meet certain criteria laid out in the CORS specification don’t trigger a preflight.
However, and perhaps obviously, a settings-changing POST with a Content-Type of application/json isn’t “simple,” and in Moberly’s bug-hunting expedition his first attempt with that header failed.
He then removed the Content-Type header so the browser would treat it as a “simple” request and skip the CORS preflight check altogether. This allowed Moberly to send the request from JavaScript in the browser straight to the GUI’s settings endpoint, ultimately changing the local Ollama settings.
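The distinction can be expressed as a small classifier. The following is a simplified sketch of the CORS rules, considering only the method and the Content-Type header (the real spec also safelists a handful of other headers and caps their sizes):

```python
# Methods and Content-Type values the CORS spec treats as "simple",
# i.e. allowed to go out cross-origin without a preflight OPTIONS request.
SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}

def triggers_preflight(method: str, headers: dict) -> bool:
    """Simplified check: does this cross-origin request need a preflight?"""
    if method.upper() not in SIMPLE_METHODS:
        return True
    content_type = headers.get("Content-Type")
    if content_type is not None:
        # Ignore parameters like "; charset=utf-8" when comparing.
        if content_type.split(";")[0].strip().lower() not in SIMPLE_CONTENT_TYPES:
            return True
    return False
```

Run through this logic, Moberly’s first attempt (a POST with Content-Type: application/json) triggers a preflight, while the same POST with the header stripped does not, which is what let the request reach the settings endpoint unchallenged.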
PoC Exploit
The exploit PoC, which Moberly published on GitLab, targets a macOS or Windows machine running the Ollama desktop app, while the “attacker” machine needs Ollama with the core API running and a GPU to handle model inference. The attacker machine must also be reachable from the victim, whether over the same LAN or over the internet with port forwarding.
The drive-by attack itself works in two stages:
First, a malicious website uses JavaScript to scan ports 40000 through 65535 on visitors’ local machines to find the GUI’s random port, then sends a forged “simple” POST request that reconfigures the app to use a malicious remote server.
While Ollama’s local API uses 11434 as its default TCP port, the GUI uses a second, random port that changes every time the user restarts the app, hence the scanning.
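The scanning step itself is simple. The real PoC does it from JavaScript in the browser; the Python below is an illustrative stand-in that probes a TCP port range for a listening service (the function name, range, and timeout are our assumptions, not values taken from the PoC):

```python
import socket

def find_open_ports(host: str, lo: int, hi: int, timeout: float = 0.05) -> list[int]:
    """Return every port in [lo, hi] that accepts a TCP connection."""
    open_ports = []
    for port in range(lo, hi + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success, an errno on failure,
            # so a closed port doesn't raise an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

A browser-based drive-by version can’t open raw sockets, so it fires many fetch attempts in parallel and infers which ports are open from how they fail; the sequential loop above just shows the idea.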
Once the configuration is in place, the attacker can permanently intercept all chat requests, logging every message to the attacker’s server and modifying every AI response in real time.
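Once the victim’s chats flow through the attacker’s Ollama instance, tampering amounts to rewriting responses in transit. A minimal sketch of that interception step, assuming a chat response shaped like Ollama’s documented /api/chat reply with a nested message object (field names here should be treated as illustrative):

```python
import json

CAPTURED = []  # attacker-side log of everything the victim sends

def intercept_chat(request_body: str, model_response: dict) -> dict:
    """Log the victim's chat request and rewrite the model's reply."""
    CAPTURED.append(json.loads(request_body))  # spy on the conversation
    tampered = dict(model_response)
    message = dict(tampered.get("message", {}))
    # Inject attacker-chosen text into the assistant's answer.
    message["content"] = (
        message.get("content", "")
        + "\n\nVisit totally-legit.example for a free upgrade!"
    )
    tampered["message"] = message
    return tampered
```

An attacker running this logic in a proxy in front of a real Ollama instance would see every prompt and could slant, censor, or booby-trap every answer without the victim’s app noticing anything amiss.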
The malicious website would look perfectly normal to the victim, and the drive-by attack requires no user interaction.
“Beyond just spying on user interactions, an attacker can also control the models your app talks to — for example by setting their own system prompts or even serving poisoned models,” Moberly wrote. “This is because the local GUI application is actually querying the attacker’s instance of Ollama.”
The Ollama maintainers didn’t immediately respond to The Register‘s inquiries about the vulnerability, but as Moberly noted after reporting the bug to the project’s team on July 31, they acknowledged his disclosure in about 90 minutes and patched it an hour later. ®