nir: Use remove_and_dce for nir_shader_lower_instructions().
This reduces the work other shader passes spend walking over dead code, and may save extra trips around the optimization loop when DCE wasn't the last pass in it.
shader-db runtime -1.12919% +/- 0.264337% (n=49) on SKL.
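
For illustration only, the "remove and DCE" idea can be sketched on a toy IR. This is not Mesa's implementation: `toy_instr`, `toy_add_src` and `toy_remove_and_dce` are hypothetical names, and the sketch assumes each instruction tracks a plain use count and that the instruction being removed has already had its uses rewritten to the lowered replacement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_SRCS 2

/* Toy SSA instruction. Hypothetical type, NOT Mesa's nir_instr. */
struct toy_instr {
   struct toy_instr *srcs[TOY_MAX_SRCS]; /* instructions whose results we read */
   unsigned num_srcs;
   unsigned use_count;                   /* number of reads by live instructions */
   bool has_side_effects;                /* e.g. a store; never treated as dead */
   bool removed;
};

/* Record that 'user' reads the result of 'src'. */
static void
toy_add_src(struct toy_instr *user, struct toy_instr *src)
{
   assert(user->num_srcs < TOY_MAX_SRCS);
   user->srcs[user->num_srcs++] = src;
   src->use_count++;
}

/* Remove 'instr', then recursively remove any source that is left with no
 * users and has no side effects. The caller is assumed to have rewritten all
 * uses of 'instr' already. Returns the number of instructions removed.
 */
static unsigned
toy_remove_and_dce(struct toy_instr *instr)
{
   if (instr->removed)
      return 0;

   instr->removed = true;
   unsigned count = 1;

   for (unsigned i = 0; i < instr->num_srcs; i++) {
      struct toy_instr *src = instr->srcs[i];

      assert(src->use_count > 0);
      src->use_count--;

      /* The source just lost its last user, so it is dead too. */
      if (src->use_count == 0 && !src->has_side_effects)
         count += toy_remove_and_dce(src);
   }

   return count;
}
```

In this sketch, removing a lowered instruction also cleans up the chain that only fed it, so later passes never visit those instructions, while sources that still have other users are left alone.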