Nvidia GPUs Compromised by Root-Level Rowhammer Attacks

Researchers demonstrate GDDRHammer and GeForge exploits, using Nvidia GPU memory bit flips to gain full root control over host CPU systems.

Two independent research teams have demonstrated new Rowhammer attacks that exploit the GDDR6 memory on Nvidia GPUs to gain full root control over a host machine. The vulnerabilities cross the traditional boundary between graphics hardware and the central processor. For developers relying on shared GPU instances to run LLMs locally or in the cloud, this turns hardware isolation from a performance trade-off into a critical security requirement.

Exploiting GPU Page Tables

The first exploit, GDDRHammer, was demonstrated on the Ampere-generation RTX 6000. It relies on memory massaging to steer GPU page tables into regions of memory known to be susceptible to electrical disturbance. Attackers can then induce an average of 129 bit flips per memory bank, a 64-fold increase over 2025’s GPUHammer attack.

Once the page table bits are flipped, the attacker rewrites page table entries to gain arbitrary read and write access across the entire GPU memory space. Because many systems do not isolate GPU and CPU address spaces, the attacker can then reach directly into the host CPU’s memory.
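To make the mechanism concrete, here is a toy Python model of why a single flipped bit in a page table entry matters. The entry layout (12 low flag bits, frame number above them, 4 KiB pages) and the helper names are assumptions chosen for illustration; they are not Nvidia's actual page table format, and nothing here touches real hardware.

```python
# Toy model only: the layout below is an assumption for illustration,
# not the real GPU page table entry format.
PAGE_SHIFT = 12                      # assume 4 KiB pages
OFFSET_MASK = (1 << PAGE_SHIFT) - 1  # low 12 bits


def make_pte(frame_number: int, flags: int = 0x1) -> int:
    """Pack a physical frame number and low flag bits into one integer entry."""
    return (frame_number << PAGE_SHIFT) | (flags & OFFSET_MASK)


def frame_of(pte: int) -> int:
    """Extract the physical frame number from an entry."""
    return pte >> PAGE_SHIFT


def flip_bit(value: int, bit: int) -> int:
    """Model a single disturbance-induced bit flip with XOR."""
    return value ^ (1 << bit)


def translate(pte: int, offset: int) -> int:
    """Combine the entry's frame with a page offset to get a physical address."""
    return (frame_of(pte) << PAGE_SHIFT) | (offset & OFFSET_MASK)


original = make_pte(frame_number=0x1000)
corrupted = flip_bit(original, PAGE_SHIFT + 4)  # flip bit 4 of the frame number

print(hex(frame_of(original)))         # 0x1000
print(hex(frame_of(corrupted)))        # 0x1010
print(hex(translate(corrupted, 0x34))) # 0x1010034
```

The flags are untouched, so the entry still looks valid, but the same virtual address now resolves to a different physical frame. If that frame happens to hold another page table, the attacker can edit mappings at will, which is the step the researchers describe.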

GeForge and Root Escalation

The second paper details GeForge, a related exploit path targeting the last-level page directory rather than the page table itself. Testing on an RTX 3060 yielded 1,171 bit flips, while the RTX 6000 saw 202 flips.

The researchers used these bit flips to forge GPU page tables directly. The resulting proof-of-concept payload opens a root shell window on the host machine. This confirms that an unprivileged user sharing a GPU instance can escalate privileges to execute code on the underlying host operating system. That is a severe threat for organizations running untrusted code or multi-agent systems on shared tenant nodes.

Hardware and Configuration Dependencies

Both attacks depend on the host machine having its Input-Output Memory Management Unit (IOMMU) disabled. System administrators frequently disable the IOMMU in BIOS configurations to reduce latency and avoid performance overhead during heavy compute workloads.

The primary targets are high-performance data centers. Malicious tenants renting fractions of an $8,000 GPU can attack the underlying server infrastructure from within their allocated slice. Both research teams focused on the Ampere generation of hardware. The newer Ada Lovelace generation uses a different GDDR implementation that researchers have not yet reverse-engineered, leaving its vulnerability status unconfirmed.

If you manage bare-metal GPU clusters or use multi-tenant AI infrastructure, audit your hardware configurations immediately. Ensure the IOMMU is enabled on all hosts handling untrusted workloads, and isolate your production inference clusters behind strict hypervisor boundaries until hardware-level mitigations are available.
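As a starting point for that audit, the sketch below checks whether a Linux host appears to have an active IOMMU. It assumes a Linux system that exposes IOMMU groups under /sys/kernel/iommu_groups and boot parameters in /proc/cmdline; the function names are hypothetical, and a non-zero group count is only a first signal, not proof that GPU traffic is actually isolated.

```python
from pathlib import Path

# Kernel boot parameters that commonly influence IOMMU behaviour on Linux.
IOMMU_PREFIXES = ("iommu=", "intel_iommu=", "amd_iommu=")


def iommu_group_count(sysfs_root: str = "/sys/kernel/iommu_groups") -> int:
    """Count IOMMU groups exposed by the kernel; 0 suggests no active IOMMU."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return 0
    return sum(1 for entry in root.iterdir() if entry.is_dir())


def iommu_boot_flags(cmdline_path: str = "/proc/cmdline") -> list[str]:
    """Return the IOMMU-related tokens found on the kernel command line."""
    try:
        tokens = Path(cmdline_path).read_text().split()
    except OSError:
        return []
    return [token for token in tokens if token.startswith(IOMMU_PREFIXES)]


if __name__ == "__main__":
    groups = iommu_group_count()
    flags = iommu_boot_flags()
    status = "appears active" if groups > 0 else "appears inactive or unavailable"
    print(f"IOMMU {status}: {groups} group(s), boot flags: {flags or 'none'}")
```

Both paths are parameters so the same check can run against a collected snapshot of a host's sysfs tree rather than the live machine. A host that reports zero groups is worth investigating before it accepts untrusted GPU workloads.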
