The execve system call can grant a newly-started program privileges that
its parent did not have.  The most obvious examples are setuid/setgid
programs and file capabilities.  To prevent the parent program from
gaining these privileges as well, the kernel and user code must be
careful to prevent the parent from doing anything that could subvert the
child.  For example:

 - The dynamic loader handles LD_* environment variables differently if
   a program is setuid.

 - chroot is disallowed to unprivileged processes, since it would allow
   /etc/passwd to be replaced from the point of view of a process that
   inherited chroot.

 - The exec code has special handling for ptrace.

These are all ad-hoc fixes.  The no_new_privs bit (since Linux 3.5) is a
new, generic mechanism to make it safe for a process to modify its
execution environment in a manner that persists across execve.  Any task
can set no_new_privs.  Once the bit is set, it is inherited across fork,
clone, and execve and cannot be unset.  With no_new_privs set, execve
promises not to grant the privilege to do anything that could not have
been done without the execve call.  For example, the setuid and setgid
bits will no longer change the uid or gid; file capabilities will not
add to the permitted set, and LSMs will not relax constraints after
execve.

To set no_new_privs, use prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0).

Be careful, though: LSMs might also not tighten constraints on exec
in no_new_privs mode.  (This means that setting up a general-purpose
service launcher to set no_new_privs before execing daemons may
interfere with LSM-based sandboxing.)

Note that no_new_privs does not prevent privilege changes that do not
involve execve.  An appropriately privileged task can still call
setuid(2) and receive SCM_RIGHTS datagrams.
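
As a minimal sketch of the prctl call described above, the following C
fragment sets the bit and reads it back.  The helper names
set_no_new_privs and get_no_new_privs are hypothetical and exist only
for illustration.  PR_GET_NO_NEW_PRIVS is the matching query operation,
and the numeric fallbacks are assumed to be needed only with userspace
headers that predate Linux 3.5.

```c
#include <sys/prctl.h>

/* Fallbacks for old userspace headers; values match the Linux UAPI. */
#ifndef PR_SET_NO_NEW_PRIVS
#define PR_SET_NO_NEW_PRIVS 38
#endif
#ifndef PR_GET_NO_NEW_PRIVS
#define PR_GET_NO_NEW_PRIVS 39
#endif

/*
 * Hypothetical helper: set no_new_privs for the calling task.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int set_no_new_privs(void)
{
	return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/*
 * Hypothetical helper: query the bit for the calling task.
 * Returns 1 if set, 0 if not set, -1 on failure with errno set
 * (for example on a kernel that does not know the operation).
 */
static int get_no_new_privs(void)
{
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

On a 3.5 or later kernel, set_no_new_privs() followed by
get_no_new_privs() should return 0 and then 1.  A later
prctl(PR_SET_NO_NEW_PRIVS, 0, 0, 0, 0) is expected to fail with EINVAL,
which is consistent with the bit being impossible to unset.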

There are two main use cases for no_new_privs so far:

 - Filters installed for the seccomp mode 2 sandbox persist across
   execve and can change the behavior of newly-executed programs.
   Unprivileged users are therefore only allowed to install such filters
   if no_new_privs is set.

 - By itself, no_new_privs can be used to reduce the attack surface
   available to an unprivileged user.  If everything running with a
   given uid has no_new_privs set, then that uid will be unable to
   escalate its privileges by directly attacking setuid, setgid, and
   fcap-using binaries; it will need to compromise something without the
   no_new_privs bit set first.

In the future, other potentially dangerous kernel features could become
available to unprivileged tasks if no_new_privs is set.  In principle,
several options to unshare(2) and clone(2) would be safe when
no_new_privs is set, and no_new_privs + chroot is considerably less
dangerous than chroot by itself.
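
The seccomp use case can be sketched as follows.  This is an
illustration under stated assumptions, not a recommended policy: the
helper name install_allow_all_filter is hypothetical, and the filter is
a single instruction that allows every system call, so it changes
nothing except to show the ordering requirement between the two prctl
calls.

```c
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper: set no_new_privs, then install a seccomp mode 2
 * filter that allows every system call.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int install_allow_all_filter(void)
{
	/* One-instruction BPF program: return SECCOMP_RET_ALLOW. */
	struct sock_filter insns[] = {
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = (unsigned short)(sizeof(insns) / sizeof(insns[0])),
		.filter = insns,
	};

	/* An unprivileged task must set the bit before installing a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;

	/* The filter persists across execve, as does no_new_privs. */
	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

If the PR_SET_NO_NEW_PRIVS call is left out, a task without
CAP_SYS_ADMIN should see PR_SET_SECCOMP fail with EACCES.  A real
sandbox would replace the allow-all program with a filter that inspects
the system call number and arguments.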