ci: Consistently use -j4 across x86 build jobs and -j8 on ARM.
Our shared runners are set up for concurrent jobs ~= CPUs / 4 (x86) or 8 (ARM). If you use more build processes than that, then jobs may be fighting each other for shared system resources, possibly to the point of failure (we've seen one of the runners OOM on some jobs before, though I'm not sure if this was the cause).
We had a ninja invocation missing its '-j', so it would spawn processes ~= CPUs. We also had an ARM kernel build with -j12 for unclear reasons.