panfrost: Make pan_merge macro more robust

Consider the following innocuous-looking code:

   pan_merge(packed, vtx->attributes[i], ATTRIBUTE);

Under the current implementation, this code is completely broken. Why?
The current implemention is a macro which hardcodes the loop index i,
which shadows the i used to index attributes. Pull out a helper method
so we do the right thing without resulting to further preprocessor abuse
(__COUNTER__).

While making things more robust, assert the crucial pan_merge
invariant that the total size is a multiple of 4; if this fails, the
routine risks memory corruption.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <!14119>
34 jobs for !14119 with merge in 8 minutes and 28 seconds (queued for 10 seconds)
latest merge request