Linux May 17, 2026 5 min read

The missing mm_struct: How Linux leaked root file descriptors

For years, the Linux kernel harbored a subtle lifetime bug in how it handled task dumpability during process exit. Jann Horn flagged the underlying pattern in 2020, but its practical exploitability was only fully realized much later: unprivileged users could steal open file descriptors for sensitive files such as /etc/shadow or SSH host keys. This article breaks down the mm_struct lifetime issue, how pidfd_getfd turned it into an exploit, and the ssh-keysign-pwn proof-of-concept repository.

Contents

  1. The security check that failed open
  2. The anatomy of a process exit
  3. The exploit window
  4. Weaponizing pidfd_getfd
  5. The ssh-keysign exploit
  6. The chage exploit
  7. Escalating to root
  8. The fix: Keeping mm around
  9. Conclusion

1. The security check that failed open

When one process tries to inspect or interact with another process (for example, via ptrace), the kernel needs to ensure the caller has the right permissions. The central function for this is __ptrace_may_access().

To determine whether access should be granted, __ptrace_may_access() inspects the target task’s memory descriptor, the mm_struct. Specifically, it checks mm->flags (to see whether the process is dumpable) and mm->user_ns (to decide whether the caller’s capabilities in that user namespace can override a non-dumpable result). The problem arises when the target task is in the middle of exiting.

2. The anatomy of a process exit

Process termination in Linux is not instantaneous. When a process exits, the kernel tears down its resources one step at a time. Two critical steps in this teardown are exit_mm() (which detaches the task from its memory map) and exit_files() (which releases the task’s file descriptor table, closing descriptors that nothing else shares).

Crucially, exit_mm() happens before exit_files(). This means there is a window during task exit where the process has lost its mm_struct, but its file descriptors are still open and attached to the task.
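The ordering can be made concrete with a toy model. The sketch below is a userspace simulation, not kernel code: the Task class, its fields, and the descriptor number 3 are illustrative assumptions, and only the order of the two teardown steps is taken from the text above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    """Toy stand-in for a task: a memory descriptor plus an fd table."""
    mm: Optional[str] = "mm_struct"
    # Hypothetical descriptor table: fd 3 refers to a sensitive file.
    files: dict = field(default_factory=lambda: {3: "/etc/shadow"})


def exit_mm(task: Task) -> None:
    task.mm = None      # the memory descriptor is dropped first


def exit_files(task: Task) -> None:
    task.files = {}     # the descriptor table is released afterwards


def teardown_states(task: Task):
    """Yield (step, has_mm, open_fds) before and after each teardown step."""
    yield "running", task.mm is not None, sorted(task.files)
    exit_mm(task)
    yield "after exit_mm", task.mm is not None, sorted(task.files)
    exit_files(task)
    yield "after exit_files", task.mm is not None, sorted(task.files)


states = list(teardown_states(Task()))
for step, has_mm, fds in states:
    print(f"{step:17} mm={'set' if has_mm else 'NULL'} open fds={fds}")
```

The middle state, with mm gone and fd 3 still open, is the window the rest of this article is about.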

3. The exploit window

So, what happens if __ptrace_may_access() is called during this exact window? The task still exists (so __put_task_struct() hasn’t been called), but task->mm is NULL.

Historically, the kernel failed open here: if the mm was gone, the dumpability check was skipped and the task was treated as if it were dumpable. From a security perspective this is dangerous, because anyone who can reach the exiting task during this window bypasses the dumpability protections.
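A simplified model of that decision shows why the missing mm matters. This is a sketch under stated assumptions: it keeps only a UID comparison and the dumpability test, and leaves out capabilities, group IDs, and LSM hooks that the real __ptrace_may_access() also consults. The function name may_access_fail_open is invented for illustration; the SUID_DUMP_* values mirror the kernel's constants.

```python
from dataclasses import dataclass
from typing import Optional

SUID_DUMP_DISABLE = 0   # non-dumpable, e.g. after a credential change
SUID_DUMP_USER = 1      # ordinary dumpable process


@dataclass
class Mm:
    dumpable: int


@dataclass
class Task:
    uid: int
    mm: Optional[Mm]    # None models task->mm == NULL during exit


def may_access_fail_open(caller_uid: int, target: Task) -> bool:
    """Toy model of the historical check (capabilities and LSMs omitted)."""
    if caller_uid != target.uid:
        return False
    if target.mm is None:
        # Fail open: with no mm there is nothing to inspect, so the
        # dumpability test is skipped and access is granted.
        return True
    return target.mm.dumpable == SUID_DUMP_USER


running = Task(uid=1000, mm=Mm(SUID_DUMP_DISABLE))
exiting = Task(uid=1000, mm=None)

print(may_access_fail_open(1000, running))  # False: non-dumpable, denied
print(may_access_fail_open(1000, exiting))  # True: same task, mm already gone
```

The same caller is refused while the target is running and accepted once the target has passed exit_mm(), even though nothing about the target's privileges or open files has changed.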

4. Weaponizing pidfd_getfd

To exploit this, an attacker needs a way to interact with the target process’s file descriptors. Enter pidfd_getfd(2), a system call that allows a process to obtain a duplicate of a file descriptor from another process.

pidfd_getfd relies on __ptrace_may_access() to verify permissions. Normally, a setuid binary that has changed credentials is marked non-dumpable, so the check refuses callers who merely share its UID. But if the attacker’s UID matches the exiting process’s UID (which happens when a setuid binary drops privileges back to the caller), the attacker can call pidfd_getfd during the mm == NULL window. With the dumpability check bypassed, the kernel duplicates the file descriptor, even if it refers to a file the attacker could never open directly.
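For readers who have not used the system call, here is a minimal sketch of invoking pidfd_getfd(2) from Python through ctypes. It only wraps the call and is not the code from the ssh-keysign-pwn repository. Two assumptions: the syscall number 438 (valid on x86-64, arm64, and most other architectures, but not all) and a kernel of version 5.6 or newer. The wrapper name and error handling are my own.

```python
import ctypes
import os

# Assumption: 438 is the pidfd_getfd syscall number on this architecture.
SYS_pidfd_getfd = 438

_libc = ctypes.CDLL(None, use_errno=True)
_libc.syscall.restype = ctypes.c_long


def pidfd_getfd(pidfd: int, targetfd: int, flags: int = 0) -> int:
    """Duplicate descriptor `targetfd` of the process referred to by `pidfd`.

    Returns a new descriptor in the calling process, or raises OSError.
    The kernel requires `flags` to be 0.
    """
    ret = _libc.syscall(
        ctypes.c_long(SYS_pidfd_getfd),
        ctypes.c_int(pidfd),
        ctypes.c_int(targetfd),
        ctypes.c_uint(flags),
    )
    if ret < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return ret
```

A caller would typically obtain the first argument with os.pidfd_open(pid) (Python 3.9 or later) and then call pidfd_getfd(pidfd, 3) to request a copy of the target's descriptor 3. Whether that succeeds is decided by the ptrace access check described above; on failure the wrapper raises OSError with the kernel's errno, for example EPERM.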

5. The ssh-keysign exploit

The ssh-keysign-pwn repository demonstrates a perfect real-world target for this: ssh-keysign.

ssh-keysign is a setuid root helper program used for host-based authentication. When invoked, it opens the private SSH host keys (e.g., /etc/ssh/ssh_host_rsa_key) as root. It then permanently drops privileges to the calling user using permanently_set_uid().

If the configuration (EnableSSHKeysign=no) causes it to bail out, it exits with the host key file descriptors still open. An attacker can repeatedly execute ssh-keysign, racing its exit with pidfd_getfd. If they hit the window, they receive a file descriptor for the private SSH host key, which they can then read to compromise the server’s identity.

6. The chage exploit

Another target is the chage utility, which changes user password expiry information. When running chage -l <user>, the program opens /etc/shadow (which unprivileged users cannot read). It then drops privileges by calling setreuid(ruid, ruid).

Just like ssh-keysign, this creates a scenario where an unprivileged user can race the process exit. By successfully using pidfd_getfd during the vulnerable window, the attacker can extract the file descriptor for /etc/shadow and read the system’s password hashes for offline cracking.

7. Escalating to root

So how do you actually go from reading these files to full root privileges? The chage exploit provides a direct path. By extracting the file descriptor for /etc/shadow, you gain read access to the system’s password hashes.

Once you have the root password hash, you can use offline password cracking tools like Hashcat or John the Ripper. If the root account has a weak or guessable password, cracking it will reveal the plaintext password. With that in hand, simply running su - and entering the cracked password grants you a full root shell.

If the ssh-keysign exploit is used instead, obtaining the private SSH host keys allows you to impersonate the server. While this doesn’t directly give you a root shell on the current machine, it enables you to intercept incoming SSH connections or perform Man-in-the-Middle (MitM) attacks against users connecting to the server, potentially capturing their credentials (including those of administrators).

8. The fix: Keeping mm around

Jann Horn originally described the shape of this issue in October 2020. The proper fix (eventually merged as commit 31e62c2ebbfd) was to ensure that the task_struct holds a reference to the mm_struct until the task goes away completely.

By ensuring the mm_struct outlives the dumpability checks and the file descriptors, __ptrace_may_access() can reliably inspect the actual dumpable state of the process throughout its entire teardown, closing the fail-open window.
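The effect of the fix can be sketched with a toy model. This illustrates the idea only and is not the code from the commit: the field name retained_mm is hypothetical, and the check is reduced to a UID comparison plus the dumpability test, with capabilities and LSM hooks left out.

```python
from dataclasses import dataclass
from typing import Optional

SUID_DUMP_DISABLE = 0
SUID_DUMP_USER = 1


@dataclass
class Mm:
    dumpable: int


@dataclass
class Task:
    uid: int
    mm: Optional[Mm]    # cleared by exit_mm(), as before
    retained_mm: Mm     # hypothetical: reference kept until the task is freed


def exit_mm(task: Task) -> None:
    task.mm = None      # retained_mm is deliberately left in place


def may_access(caller_uid: int, target: Task) -> bool:
    """Toy check that consults the retained descriptor, never a missing one."""
    if caller_uid != target.uid:
        return False
    return target.retained_mm.dumpable == SUID_DUMP_USER


mm = Mm(SUID_DUMP_DISABLE)
helper = Task(uid=1000, mm=mm, retained_mm=mm)
before = may_access(1000, helper)
exit_mm(helper)
after = may_access(1000, helper)
print(before, after)  # False False: the answer no longer changes during exit
```

In this model the check gives the same answer before and after exit_mm(), so there is no point in the teardown where a non-dumpable task looks dumpable.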

9. Conclusion

The missing mm_struct bug is a classic example of how sequential teardown in a complex system can create hazardous race conditions. A security check that fails open when data is missing, combined with a relatively new system call like pidfd_getfd, turned a theoretical lifetime issue into a practical way to leak privileged file descriptors and, where the leaked secrets are weak enough to crack, a path to local privilege escalation. It underscores the importance of strict reference counting and the dangers of “fail open” defaults in security boundaries.
