radeonsi/gfx10: fix sgpr/vgpr hardware limit computation
- gfx10 has more VGPRs per SIMD
- in wave64 mode the wave count is halved, and each register must be counted twice
- make it explicit that we're using WGP mode
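
As a rough illustration of the wave64 accounting described above, here is a minimal sketch in C. It is not the actual radeonsi code: the function name, its parameters, and the choice to express the register file in wave32-sized registers are all assumptions made for this example, and the hardware limits are passed in rather than hard-coded.

```c
/* Hypothetical helper for illustration only; names and parameters are
 * assumptions, not the radeonsi implementation. */
static unsigned
sketch_max_simd_waves(unsigned wave_size,             /* 32 or 64 */
                      unsigned shader_vgprs,          /* VGPRs used by the shader */
                      unsigned vgprs_per_simd_wave32, /* register file size, in wave32 registers */
                      unsigned max_waves_wave32)      /* wave slots per SIMD in wave32 */
{
   /* In wave64 each VGPR covers twice as many lanes, so it is counted
    * twice against the register file, and only half the waves fit. */
   unsigned scale = wave_size == 64 ? 2 : 1;
   unsigned max_waves = max_waves_wave32 / scale;

   if (shader_vgprs) {
      unsigned by_vgprs = vgprs_per_simd_wave32 / (shader_vgprs * scale);
      if (by_vgprs < max_waves)
         max_waves = by_vgprs;
   }

   return max_waves;
}
```

With placeholder inputs (not taken from this commit) of 1024 registers, 20 wave slots and a shader using 64 VGPRs, the sketch yields 16 waves in wave32 and 8 in wave64.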