Other ways to support HackTricks:
A user namespace is a Linux kernel feature that provides isolation of user and group ID mappings, allowing each user namespace to have its own set of user and group IDs. This isolation enables processes running in different user namespaces to have different privileges and ownership, even if they share the same user and group IDs numerically.
User namespaces are particularly useful in containerization, where each container should have its own independent set of user and group IDs, allowing for better security and isolation between containers and the host system.
Processes can move between namespaces using the setns() system call or create new namespaces using the unshare() or clone() system calls with the CLONE_NEWUSER flag. When a process moves to a new namespace or creates one, it will start using the user and group ID mappings associated with that namespace.
sudo unshare -U [--mount-proc] /bin/bash
By mounting a new instance of the /proc filesystem (which is what the --mount-proc parameter does), you ensure that the new mount namespace has an accurate and isolated view of the process information specific to that namespace.
When unshare is executed without the -f option, an error is encountered due to the way Linux handles new PID (Process ID) namespaces. The key details and the solution are outlined below:
Problem Explanation:
1. The Linux kernel allows a process to create new namespaces using the unshare system call. However, the process that initiates the creation of a new PID namespace (referred to as the "unshare" process) does not enter the new namespace; only its child processes do.
2. Running unshare -p /bin/bash starts /bin/bash in the same process as unshare. Consequently, /bin/bash itself stays in the original PID namespace, and only the processes it spawns are placed in the new one.
3. The first child process of /bin/bash in the new namespace becomes PID 1. When this process exits, it triggers the cleanup of the namespace if there are no other processes, as PID 1 has the special role of adopting orphan processes. The Linux kernel will then disable PID allocation in that namespace.
Consequence:
The exit of PID 1 in the new namespace clears the PIDNS_HASH_ADDING flag. This results in the alloc_pid function failing to allocate a new PID when creating a new process, producing the "Cannot allocate memory" error.
Solution:
The issue can be resolved by using the -f option with unshare. This option makes unshare fork a new process after creating the new PID namespace.
Executing unshare -fp /bin/bash ensures that the child forked by unshare, which then runs /bin/bash, becomes PID 1 in the new namespace. /bin/bash and its child processes are then safely contained within this new namespace, preventing the premature exit of PID 1 and allowing normal PID allocation.
By ensuring that unshare runs with the -f flag, the new PID namespace is correctly maintained, allowing /bin/bash and its sub-processes to operate without encountering the memory allocation error.
docker run -ti --name ubuntu1 -v /usr:/ubuntu1 ubuntu bash
To use user namespaces, the Docker daemon needs to be started with --userns-remap=default (in Ubuntu 14.04, this can be done by modifying /etc/default/docker and then executing sudo service docker restart).
ls -l /proc/self/ns/user
lrwxrwxrwx 1 root root 0 Apr 4 20:57 /proc/self/ns/user -> 'user:[4026531837]'
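The number in brackets is the inode that identifies the user namespace, so two processes share a user namespace when their ns/user links resolve to the same target. A minimal sketch of that comparison follows. The names ns_inode and same_userns are hypothetical helpers introduced here for illustration, and reading the links of processes owned by other users is assumed to need sufficient privileges.

```shell
# ns_inode (hypothetical helper): extract the inode number from a namespace
# link target such as 'user:[4026531837]'. Prints nothing if the format differs.
ns_inode() {
    printf '%s\n' "$1" | sed -n 's/^[a-z_]*:\[\([0-9][0-9]*\)]$/\1/p'
}

# same_userns (hypothetical helper): succeed if two PIDs share a user namespace.
# Returns 2 if either link cannot be read (process gone or access denied).
same_userns() {
    ns_a=$(readlink "/proc/$1/ns/user") || return 2
    ns_b=$(readlink "/proc/$2/ns/user") || return 2
    [ "$ns_a" = "$ns_b" ]
}
```

For instance, same_userns $$ 1 compares the current shell with PID 1, and ns_inode "$(readlink /proc/self/ns/user)" prints only the numeric identifier.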
It's possible to check the user map from the docker container with:
cat /proc/self/uid_map
0 0 4294967295 --> No remapping: root in the container is root (UID 0) on the host
0 231072 65536 --> With remapping: root in the container is UID 231072 on the host
Or from the host with:
cat /proc/<pid>/uid_map
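Each uid_map line holds three fields: the first ID inside the namespace, the first ID outside it, and the length of the mapped range. The sketch below applies that rule to translate an ID seen inside the namespace into the corresponding ID outside it. The function name map_uid is a hypothetical helper, and it assumes the map content is supplied on standard input.

```shell
# map_uid (hypothetical helper): translate UID $1, as seen inside a user
# namespace, to the UID outside it. Reads uid_map content on stdin
# (columns: first inside ID, first outside ID, range length).
# Exits non-zero if the UID is not covered by any mapping line.
map_uid() {
    awk -v id="$1" '
        id >= $1 && id < $1 + $3 {
            printf "%d\n", $2 + (id - $1)
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    '
}
```

With the remapped container above, map_uid 0 < /proc/<pid>/uid_map would print 231072, and UID 1000 inside the container would correspond to 232072 on the host.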
sudo find /proc -maxdepth 3 -type l -name user -exec readlink {} \; 2>/dev/null | sort -u
# Find the processes in a specific namespace
sudo find /proc -maxdepth 3 -type l -name user -exec ls -l {} \; 2>/dev/null | grep <ns-number>
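As an alternative to filtering ls output with grep, the same lookup can be sketched as a small function that prints only the matching PIDs. The name pids_in_userns and its optional second argument (the procfs mount point) are assumptions made for this example; like the find commands above, it only sees processes whose namespace links the caller is allowed to read.

```shell
# pids_in_userns (hypothetical helper): print the PIDs whose user namespace
# link points at inode $1. $2 is the procfs mount point (defaults to /proc).
pids_in_userns() {
    proc_root=${2:-/proc}
    for ns_link in "$proc_root"/[0-9]*/ns/user; do
        # Unreadable or vanished entries yield an empty string and are skipped
        [ "$(readlink "$ns_link" 2>/dev/null)" = "user:[$1]" ] || continue
        pid_path=${ns_link#"$proc_root"/}
        printf '%s\n' "${pid_path%%/*}"
    done
}
```

For example, sudo sh -c with this function defined and pids_in_userns 4026531837 would list every process in the namespace shown earlier.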
nsenter -t TARGET_PID -U --pid /bin/bash
Also, you can only enter another process's namespace if you are root, and you cannot enter another namespace without a descriptor pointing to it (like /proc/self/ns/user).
unshare -U [--map-user=<uid>|<name>] [--map-group=<gid>|<name>] [--map-root-user] [--map-current-user]
# Container
sudo unshare -U /bin/bash
nobody@ip-172-31-28-169:/home/ubuntu$ # Note that the user is now nobody
# From the host
ps -ef | grep bash # The user inside the host is still root, not nobody
root 27756 27755 0 21:11 pts/10 00:00:00 /bin/bash
In the case of user namespaces, when a new user namespace is created, the process that enters the namespace is granted a full set of capabilities within that namespace. These capabilities allow the process to perform privileged operations such as mounting filesystems, creating devices, or changing ownership of files, but only within the context of its user namespace.
For example, when you have the CAP_SYS_ADMIN capability within a user namespace, you can perform operations that typically require this capability, like mounting filesystems, but only within the context of your user namespace. Any operations you perform with this capability won't affect the host system or other namespaces.
⚠️ Therefore, even though a process started inside a new user namespace gets all the capabilities back (CapEff: 000001ffffffffff), it can only use the ones that apply to resources governed by that namespace (mount, for example), not all of them. On its own, this is not enough to escape from a Docker container.
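The CapEff value is a hexadecimal bitmask in which bit N stands for capability number N. A minimal sketch for testing a single bit is shown below; has_cap is a hypothetical helper name, and the capability number 21 for CAP_SYS_ADMIN comes from linux/capability.h. A set bit only says the capability is held relative to the process's own user namespace, not that it is honored for host resources.

```shell
# has_cap (hypothetical helper): succeed if capability number $2 is set in the
# hexadecimal mask $1 (as printed in the Cap* fields of /proc/<pid>/status).
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Capability numbers come from linux/capability.h: CAP_SYS_ADMIN is 21
if has_cap 000001ffffffffff 21; then
    echo "CAP_SYS_ADMIN is present in the mask"
fi
```

To check the current process instead of a literal value, the mask can be read with awk '/^CapEff:/ {print $2}' /proc/self/status and passed as the first argument.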
# These are the syscalls that are filtered after changing the user namespace with:
unshare -UmCpf bash
Testing: 0x067 . . . Error
Testing: 0x070 . . . Error
Testing: 0x074 . . . Error
Testing: 0x09b . . . Error
Testing: 0x0a3 . . . Error
Testing: 0x0a4 . . . Error
Testing: 0x0a7 . . . Error
Testing: 0x0a8 . . . Error
Testing: 0x0aa . . . Error
Testing: 0x0ab . . . Error
Testing: 0x0af . . . Error
Testing: 0x0b0 . . . Error
Testing: 0x0f6 . . . Error
Testing: 0x12c . . . Error
Testing: 0x130 . . . Error
Testing: 0x139 . . . Error
Testing: 0x140 . . . Error
Testing: 0x141 . . . Error
Testing: 0x143 . . . Error