These tools usually run as root in order to access data in various log, /proc, and /sys files, so we have to be extra careful because we also process untrusted data: the coredumps and process state are entirely under the control of the untrusted processes. This doesn't mean we can't take steps to reduce privileges and access to runtime state we don't need, but as we reduce access we must take care not to break our ability to collect data.
With that out of the way, the current state of the world: all processes run as root all the time. Oops. That doesn't mean we want to stay here -- this section includes various ideas, suggestions, and brainstorms for how we can reduce privileges further.
Our ultimate goal at all times is to run with minimal access to resources and capabilities. Further splitting up these tools into discrete helpers/components will probably help with keeping clear lines for which tools need which privileges.
anomaly_detector needs read access to /var/log/messages and write access to /var/spool/crash/. crash_reporter_logs.conf needs read access to dmesg and some sysfs PCI registers for the kernel_warning_collector. At the very least, we should be able to have this minijail itself.
crash_sender runs in a limited minijail, but still as root, and accesses servers over the network. It needs write access to all the crash spool directories, as well as its own internal /var/lib/crash_sender/ state, and /run/lock/crash_sender. It only needs read access to /run/crash_reporter/ for testing state. Otherwise, the only mutable paths it needs are entirely self-contained in the spool paths, so dropping access to everything else should be doable. Another idea is to fork a child for parsing the crash reports and communicating with the network -- it would be able to drop privs and run under a restrictive seccomp filter. The only content it would need access to is the specific set of crash reports.
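The fork-and-drop idea can be sketched as follows. This is a minimal illustration and not how crash_sender is implemented: `drop_privileges` and `run_unprivileged` are hypothetical names, the target uid/gid are supplied by the caller, results are assumed to be JSON-serializable, and the seccomp filter mentioned above is left out entirely.

```python
import json
import os


def drop_privileges(uid, gid):
    """Permanently switch to uid/gid (hypothetical helper; changing identity needs root)."""
    if os.geteuid() == uid and os.getegid() == gid:
        return  # already running as the target account
    os.setgroups([])  # drop supplementary groups while still privileged
    os.setgid(gid)    # group first: after setuid() we could no longer change it
    os.setuid(uid)    # from root this sets the real, effective and saved uid


def run_unprivileged(func, uid, gid):
    """Run func() in a forked child after dropping privileges; return its result."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: drop privileges before touching any untrusted data.
        status = 1
        try:
            os.close(read_fd)
            drop_privileges(uid, gid)
            with os.fdopen(write_fd, "wb") as writer:
                writer.write(json.dumps(func()).encode())
            status = 0
        finally:
            os._exit(status)  # never fall back into the parent's code path
    # Parent: collect the result over the pipe, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()
    _, wait_status = os.waitpid(pid, 0)
    if not (os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0):
        raise RuntimeError("unprivileged helper failed")
    return json.loads(data)
```

The parent keeps its privileges and only ever sees the serialized result, so a parsing bug triggered by a malicious report is confined to the child's reduced identity.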
crash_reporter is a bit of a beast. Not only does it need write access to /var/lib/crash_reporter/ and the crash spool directories, but it also needs read access to many /proc/ files (especially the /proc/<pid>/ of the crashing process, which usually has read restrictions in place based on the crashing process's uid/gid), as well as all the random supplemental log sources in crash_reporter_logs.conf. Perhaps during early startup, we setuid to an account for most work, and we only setuid(0) again when we need to read a restricted path, and then we setuid back. Or we drop all caps except CAP_DAC_OVERRIDE, assuming the kernel still allows us to access all the paths we need to. For helper programs we run (most notably core2md), we should be able to run them in a more restrictive environment as they are data-in/data-out tools.
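One way to picture the "switch away and come back" idea is with the effective uid only. A plain setuid() call made as root also changes the real and saved uids, so it cannot be undone; seteuid() leaves the saved uid at 0 and lets the process return. The sketch below assumes the process was started as root and already called seteuid() to an unprivileged account at startup. `effective_uid` and `read_restricted` are hypothetical names, not part of crash_reporter.

```python
import contextlib
import os


@contextlib.contextmanager
def effective_uid(uid):
    """Temporarily switch the effective uid, restoring the previous one on exit."""
    previous = os.geteuid()
    os.seteuid(uid)
    try:
        yield
    finally:
        os.seteuid(previous)


def read_restricted(path, privileged_uid=0):
    """Read one file with elevated privileges; everything else stays unprivileged."""
    with effective_uid(privileged_uid):
        # Only the open and the read happen while elevated.
        fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW | os.O_CLOEXEC)
        with os.fdopen(fd, "rb") as f:
            return f.read()
```

Because the saved uid remains 0 in this scheme, code running in the process can still regain root, so this narrows the window of exposure rather than giving up the privilege.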
Access to the spool dirs is needed by only these tools and feedback reports (which are gathered as root by debugd). So we should be able to change all of these from root:root to a new dedicated account like crash:crash, as well as using that account for dropping privs. This is tracked in https://crbug.com/441427.
Some crash reports are considered to be especially likely to contain sensitive user information and are stored in the cryptohome. Right now these are stored in /home/user/<user_hash>/crash, but /home/user/<user_hash> is only traversable by user chronos and group chronos-access. This means that anything writing into that spool directory currently must also acquire permission to read large parts of the user data stored in the cryptohome, which limits our ability to have lower privilege processes record crashes here.
Worse, it creates a potential privilege escalation vector because crash_sender may end up processing reports with a higher privilege level than is required to write to the directory. This means a lower privileged process could set up the spool directory in an unexpected way to trick crash_sender, e.g. by creating symlinks or by modifying the directory at the same time as crash_sender is accessing it. This has been a source of many historical vulnerabilities.
Fortunately, we now have another set of paths that form part of the cryptohome, mounted under /home/root/<user_hash>. Directories under this path are created by cryptohomed and bind mounted to /var/daemon-store/*/<user_hash>, which any process can traverse. Therefore, we can create a new crash sub-directory owned by crash:crash-user-access, and processes that need to produce encrypted crash reports can be given access to only this path by making them members of crash-user-access. crash_sender will also be able to access this directory while having strictly less privilege than any process that creates crash reports. The new /home/root/<user_hash>/crash directory should eventually replace the /home/user/<user_hash>/crash directory entirely.
This leaves some residual risk that one crash reporting process will compromise another via this shared directory. Crash reporters interact with it in two ways: first by reading the filenames to determine if the directory is full, which is unlikely to be exploitable by writing things to the directory, and later by writing out files into the directory. The write could be exploited by tricking the process into writing to a symlink, but most (possibly all) writes to spool directories open files using O_CREAT|O_EXCL to ensure they only write to newly created ordinary files.
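The O_CREAT|O_EXCL behavior can be illustrated with a short sketch. The helper name `write_new_file` and the 0600 mode are assumptions for illustration, not taken from the crash-reporter code.

```python
import os


def write_new_file(directory, name, data):
    """Create `name` in `directory` and write `data`, refusing any pre-existing path."""
    path = os.path.join(directory, name)
    # With O_CREAT | O_EXCL, open() fails with EEXIST if the path already
    # exists, including when it is a symlink (dangling or not), so the write
    # can only land in a regular file that this call just created.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | os.O_CLOEXEC
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path
```

If another process plants a symlink at the expected filename first, the call raises FileExistsError instead of writing through the link. This only protects the final path component; it does not help if the spool directory itself has been replaced.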
Here we cover some vulnerabilities that were found in crash-reporter. Hopefully by understanding the types of bugs that hit us in the past, we can design a system that disables entire classes of bugs rather than simply fixing each of these in a one-off fashion.
This bug allowed the chronos user to read any file as root (including memory of processes via /proc/ symlinks).
The scenario is as follows:
[…] /home/chronos/Consent To Send Stats.
[…] chrome://crash).
[…] /proc/*/cmdline files.
The fix for this was a directed one:
There are a few alternative ways this could have been addressed, albeit with a lot more disruption to the overall system.
[…] /proc so users can only see their own processes.
This bug allowed any user on the system to get root execution.
The scenario is as follows:
[…] /tmp/crash_reporter/<pid>/ with hardcoded filenames (e.g. environ) to hold intermediate state.
[…] /home/chronos/drop/.
[…] /home/chronos/drop/environ to /proc/sys/kernel/core_pattern.
[…] /tmp/crash_reporter/getpid() to /home/chronos/drop.
[…] core_pattern (e.g. |/bin/bash /home/...).
[…] /tmp/crash_reporter/<pid>/ already exists.
[…] /proc/<pid>/environ to /tmp/crash_reporter/<pid>/environ.
[…] core_pattern.
A few directed fixes went in first:
[…] /tmp/crash_reporter/ is created and owned as root during early boot so no one else could hijack it.
[…] /tmp state to /run which is only accessible by root and the daemon that initialized the specific subdir (e.g. crash-reporter).
After that, some class fixes went in:
https://chromium-review.googlesource.com/672943 & https://chromium-review.googlesource.com/678294 & https://chromium-review.googlesource.com/723869: Have crash_reporter & crash_sender always run in unique mount namespaces and create unique & empty /tmp mounts. Now any attacks via shared /tmp are impossible.
We also mount /proc read-only so any write attacks to sysctl paths are impossible.
https://chromium-review.googlesource.com/753406: Make /proc/self/mem read-only for all processes (a noexec bypass).
There are a few class fixes that would help with this:
This bug allowed the chronos user to get root execution.
The scenario is as follows:
[…] chrome://crash).
[…] .meta file created under their chronos-owned /home/chronos/<user_hash>/crash/ spool directory.
[…] upload_ and followed by a sed script. e.g. /p;s^.*^setsid${IFS}bash${IFS}<a-shell-script>${IFS}\&^ep;/=1.
[…] .meta file.
[…] upload_... key is extracted and passed to sed verbatim.
[…] e sed command executes an arbitrary command via system().
The quick directed fix was to validate all input .meta reports:
If the .meta file contains any bad content, we just delete the report.
After that, some class fixes went in:
[…] r/w/e commands (read file/write file/execute) always unavailable and prevents arbitrary code exec ever again.
[…] system(), file redirects, and pipelines.
There are a few class fixes that would help with this:
[…] sed or awk and thus any arbitrary code execution they introduce.
This bug allowed people to chown arbitrary paths to chronos as root.
The scenario is as follows:
[…] crash spool directory in their profile.
[…] crash to an arbitrary path.
crash_reporter is invoked by the kernel as root. It dereferences the symlink and then chowns the target to chronos.
The fix was to improve the spool directory walking code to avoid any TOCTOU races, and to have all filesystem operations avoid dereferencing any symlinks.
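As a rough illustration of a chown that avoids both the symlink dereference and the check-then-use race, the sketch below opens the directory without following a final symlink and then changes ownership through the file descriptor. `chown_dir_no_follow` is a hypothetical helper, not the actual crash_reporter fix.

```python
import os


def chown_dir_no_follow(path, uid, gid):
    """chown a directory, failing instead of following a symlink at `path`."""
    # O_NOFOLLOW makes open() fail if the final component is a symlink;
    # O_DIRECTORY makes it fail if it is anything other than a directory.
    flags = os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW | os.O_CLOEXEC
    fd = os.open(path, flags)
    try:
        # Operate on the descriptor, so the object that was opened is the
        # object whose ownership changes, even if the path is swapped later.
        os.fchown(fd, uid, gid)
    finally:
        os.close(fd)
```

O_NOFOLLOW only covers the last path component. Guarding the intermediate components as well would require walking the path one directory at a time relative to an already-opened descriptor, which this sketch does not attempt.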
The larger class fix involved blocking symlinks entirely.