intel/compiler: Move PLN emulation to the fs_visitor
Move the PLN emulation code to the fs_visitor so that it benefits from optimizations that don't happen at the fs_generator level, mainly better scheduling.
One big caveat of this change is that the emulation no longer uses NF types or the accumulator. Apparently, the extra precision they provided is not needed.
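For reference, the trade-off can be illustrated outside the compiler. PLN evaluates a plane equation of the form w = p*u + q*v + r. The sketch below is not the code in this change: `PlaneCoeffs`, `eval_plane_f32`, `eval_plane_wide` and `plane_eval_difference` are hypothetical names, the two-FMA sequence is just one plausible float-only expansion, and `double` only stands in for "a wider intermediate" (it does not model the NF type or the accumulator).

```cpp
#include <cmath>

// Coefficients of one plane equation w = p*u + q*v + r.
// Hypothetical struct for this sketch; not a Mesa type.
struct PlaneCoeffs {
    float p;
    float q;
    float r;
};

// Float-only evaluation: two fused multiply-adds on 32-bit floats,
// with no wider intermediate value.
inline float eval_plane_f32(const PlaneCoeffs &c, float u, float v)
{
    float tmp = std::fma(c.p, u, c.r);  // tmp = p*u + r
    return std::fma(c.q, v, tmp);       // w   = q*v + tmp
}

// Reference evaluation with a wider intermediate. `double` is an
// assumption standing in for extra precision; it is not the NF type.
inline float eval_plane_wide(const PlaneCoeffs &c, float u, float v)
{
    double acc = static_cast<double>(c.p) * u + c.r;
    acc += static_cast<double>(c.q) * v;
    return static_cast<float>(acc);
}

// Absolute difference between the two evaluations for one input.
inline float plane_eval_difference(const PlaneCoeffs &c, float u, float v)
{
    return std::fabs(eval_plane_f32(c, u, v) - eval_plane_wide(c, u, v));
}
```

Running `plane_eval_difference` over representative coefficients and pixel offsets is one way to get a rough feel for how much precision the float-only path gives up.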
Edited by Rafael Antognolli