Update t_vertex_generic.c: Don't do inefficient sharded copies that can be...
mesa/tnl: use memcpy for more efficient copies
Replace the per-component copies with memcpy so the compiler can emit one 16-byte load/store pair instead of four 4-byte ones.
Diff for function insert_4f_4:
00536700: 8b 02 movl (%rdx), %eax ~~~> 00536700: 0f 10 02 movups (%rdx), %xmm0
00536702: 89 06 movl %eax, (%rsi) ~~~> 00536703: 0f 11 06 movups %xmm0, (%rsi)
00536704: 8b 42 04 movl 0x4(%rdx), %eax ~~~>
00536707: 89 46 04 movl %eax, 0x4(%rsi) ~~~>
0053670a: 8b 42 08 movl 0x8(%rdx), %eax ~~~>
0053670d: 89 46 08 movl %eax, 0x8(%rsi) ~~~>
00536710: 8b 42 0c movl 0xc(%rdx), %eax ~~~>
00536713: 89 46 0c movl %eax, 0xc(%rsi) ~~~>
00536716: c3 retq ===> 00536706: c3 retq
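
For illustration, the source-level change behind the disassembly above might look roughly like the sketch below. The names copy4f_by_element and copy4f_memcpy are hypothetical stand-ins, not the actual functions in t_vertex_generic.c, and the real insert_4f_4 signature is not reproduced here. The likely reason the compiler does not merge the four assignments on its own is that out and in may overlap, so each store has to complete before the next load; memcpy requires non-overlapping buffers, which permits a single 16-byte move.

```c
#include <string.h>

/* Hypothetical "before" shape: four separate 4-byte copies.
 * Because out and in have the same element type and may overlap,
 * the compiler generally has to keep the loads and stores in order. */
static void copy4f_by_element(float *out, const float *in)
{
   out[0] = in[0];
   out[1] = in[1];
   out[2] = in[2];
   out[3] = in[3];
}

/* Hypothetical "after" shape: one 16-byte memcpy.
 * memcpy has undefined behavior for overlapping buffers, so the
 * compiler is free to lower this to one unaligned vector load and
 * one unaligned vector store (movups on x86-64). */
static void copy4f_memcpy(float *out, const float *in)
{
   memcpy(out, in, 4 * sizeof(float));
}
```

Both variants produce the same result for non-overlapping buffers; only the generated code differs, and the exact instructions depend on the compiler and optimization level.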
Edited by Augustin Zidek