Default to one thread per CPU, instead of 16 regardless of the number of CPUs

This should be a better default for systems with few (or very many) cores. Also, since we began starting the longest-running tests first, there's no longer any need for extra parallelism just to get those tests to start earlier.
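
For illustration only, a minimal Python sketch of the idea, not the code touched by this change. The names default_thread_count and run_longest_first, and the (name, expected_seconds) test tuples, are hypothetical. It assumes a per-test duration estimate is available and uses it to submit the longest-running tests first.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def default_thread_count():
    # os.cpu_count() can return None when the count is unknown,
    # so fall back to a single thread in that case.
    return os.cpu_count() or 1


def run_longest_first(tests, run, threads=None):
    """Run tests on a thread pool, submitting the longest ones first.

    tests:   list of (name, expected_seconds) pairs (hypothetical format)
    run:     callable that takes a test name and returns its result
    threads: pool size; defaults to one thread per CPU
    """
    threads = threads or default_thread_count()
    # Sort by expected duration, longest first, so slow tests start early
    # without needing more threads than CPUs.
    ordered = [name for name, _ in sorted(tests, key=lambda t: t[1], reverse=True)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() submits tasks in the given order and yields results in that order.
        results = list(pool.map(run, ordered))
    return dict(zip(ordered, results))
```

With this ordering the slowest tests are the first ones picked up by the pool, which is why oversubscribing the CPUs is no longer needed to get them started early.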