gallivm: drop lower_ffma32 and let the backend handle it

This makes a bunch of CL tests execute a lot faster.