pwn.college - sandboxes

This series is a catch-all for interesting tidbits learned from the pwn.college course taught by @Zardus and the associated CTF-style challenges. The course is all about binary exploitation ranging from code injection to memory corruption and rop chain challenges.

Sandboxes

A lot has happened since the Adobe flash times, that enabled code-execution on the host system with stand-alone exploits. Nowadays flash rests in peace and processes in browsers are routinely sandboxed to decrease the chance of system compromise. Sandboxes are not restricted to browsers. Two ubiquitous examples are virtual machines (sandboxed hardware, kernel and user space) and containers (sandboxed user space). The following two sections discuss chroot and seccomp. Both approaches filter the user’s interaction with the kernel and are not sandboxes in the literal sense. Seccomp and it’s variants are still widely used, unlike chroot.

chroot

Chroot changes the meaning of / for a process and its children. The sycall requires the capability cap_sys_chroot which non-superusers are not expected to posess. We can check whether our current capabilities include these rights by issuing

capsh –print

Executing chroot creates a wall between files that are inside & outside of the new chroot tree. However, to enable the jail the chroot() command has to be followed with a chdir() into the jail. Chroot by itself does not implement syscall filtering enabling several pathways to break out of the jail. Two of which are directly stated in man 2 chroot:

1. syscall at-variants

This call does not close open file descriptors, and such file descriptors may allow access to files outside the chroot tree.

Many syscalls related to file operations have an at variant that accepts an additional file-descriptor to change the current working directory in case relative paths are used for the pathname of the file. If a fd was openened before the jail was started this can be used to break access files outside of the jail. For example man 2 open shows the call signatures of the two open() calls: int open(const char *pathname, int flags, mode_t mode) and int openat(int dirfd, const char *pathname, int flags, mode_t mode).

2. chroot’ing again

The superuser can escape from a “chroot jail” by doing: mkdir foo; chroot foo; cd ..

Not chdir’ing into the chroot tree leaves us outside the jail as the kernel doesn’t keep note of previous chroots.

seccomp

There is two versions: seccomp and seccomp-bpf. The former prevents all syscalls other than: exit(), sigreturn(), read() and write(). A minimal POC is shown here

#include <sys/prctl.h>
#include <seccomp.h>

int main(int argc, char **argv) {
	prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);
	execl("/bin/cat", "cat", "/proc/version", (char *)0);
}

stracing the elf created with gcc xx -o xx.c -lseccomp shows how the attempted execl call is prohibted through the corresponding prctl syscall.

1
2
3
4
5
PROMPT> strace ./prctl
  prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) = 0
  execve("/bin/cat", ["cat", "/proc/version"], 0x7ffdb53a38f8 /* 37 vars */) = ?
  +++ killed by SIGKILL +++
  Killed

If more granular control of filtering is desired, seccomp-bpf allows the use of extended Berkeley Packet Filters (eBPF). The following code snippet shows how to add a rule to block only the use of open.

#include <seccomp.h>

int main(int argc, char **argv) {
	scmp_filter_ctx ctx;
	ctx = seccomp_init(SCMP_ACT_ALLOW);
	seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(open), 0);
	seccomp_load(ctx);
	execl("/bin/cat", "cat", "/proc/version", (char *)0);
}

One interesting way of attacking a binary hardened with seccomp is through side-channels. In this case data can be exfiltrated despite the block of syscalls such as write. In a particular pwn.college challenge all syscalls but read were prohibited. This can either be used to communicate one bit a time or more if the run time of the program is predictable enough to slice the delay into several states. The following snippet shows a loop that will noticeably delay the program run time depending the state of a previously read bit

1
2
3
4
5
6
7
8
// rbx is either 0 or 1 depending on the current bit to leak
imul rbx, 200000000
begin:
cmp rbx, 0
je exit
sub rbx, 1
jmp begin
exit:  

An easier version of the before mentioned challenge, allowed the nanosleep() syscall which requires the setup of the struct timespec . To find out that the structure consists of two qwords we can write the following dummy code

#include <time.h>

int main() {
	struct timespec t;
	printf("size of secs: %d\n", sizeof(t.tv_sec))
	printf("size of nsecs: %d\n", sizeof(t.tv_nsec))
}

and can hence set up the following shell-code to run nanosleep(35) for two seconds

1
2
3
4
5
mov rdi, 35
mov rsi, [rsp]
mov qword ptr [rsp], 2
mov qword ptr [rsp+8], 0
syscall

Conclusion

This was definitely an exciting module and wets my appetite to check out how chrome’s seccomp sandboxes were broken out of in the past years.

References

List of syscalls for different architectures
user-space-vs-kernel-space