Index draw command stream optimization
The current index draw command stream is inefficient. For example, with the index array [0, 1000, 2], it needs 1001 VS executions and space for 1001 varying outputs, even though only three vertices are actually referenced.
However, some dump results show that this overhead can be cut by optimizing the command stream:
/* ============ VS CMD STREAM BEGIN ============= */
/* 0x10010400 (0x00000000) */ 0x10018300 0x30030000 /* UNIFORMS_ADDRESS: address: 0x10018300, size: 48 */
/* 0x10010408 (0x00000008) */ 0x10000280 0x40050000 /* SHADER_ADDRESS: address: 0x10000280, size: 80 */
/* 0x10010410 (0x00000010) */ 0x00201000 0x10000040 /* SHADER_INFO: prefetch: disabled, size: 80 */
/* 0x10010418 (0x00000018) */ 0x00000000 0x10000042 /* VARYING_ATTRIBUTE_COUNT: nr_vary: 1, nr_attr: 1 */
/* 0x10010420 (0x00000020) */ 0x00000003 0x10000041 /* UNKNOWN_1 */
/* 0x10010428 (0x00000028) */ 0x10018340 0x20020000 /* ATTRIBUTES_ADDRESS: address: 0x10018340, size: 1 */
/* 0x10010430 (0x00000030) */ 0x10018350 0x20020008 /* VARYINGS_ADDRESS: address: 0x10018350, size: 1 */
/* 0x10010438 (0x00000038) */ 0x03000001 0x00000000 /* DRAW: num: 3, index_draw: true */
/* 0x10010440 (0x00000040) */ 0x00000000 0x60000000 /* UNKNOWN_2 */
/* 0x10010448 (0x00000048) */ 0x10018360 0x20020000 /* ATTRIBUTES_ADDRESS: address: 0x10018360, size: 1 */
/* 0x10010450 (0x00000050) */ 0x10018370 0x20020008 /* VARYINGS_ADDRESS: address: 0x10018370, size: 1 */
/* 0x10010458 (0x00000058) */ 0x01000001 0x00000000 /* DRAW: num: 1, index_draw: true */
/* 0x10010460 (0x00000060) */ 0x00000000 0x60000000 /* UNKNOWN_2 */
/* 0x10010468 (0x00000068) */ 0x00018000 0x50000000 /* SEMAPHORE_END: index_draw enabled */
/* ============ VS CMD STREAM END =============== */
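The dump above issues two DRAW commands (num: 3 and num: 1), each with its own ATTRIBUTES_ADDRESS and VARYINGS_ADDRESS, instead of one draw covering the whole vertex range. One possible reading is that the referenced vertices are grouped into nearly contiguous runs and each run gets its own draw. The sketch below illustrates that idea only. The function names and the `max_gap` parameter are hypothetical, and the real heuristic used to merge or split runs is not known from the dump.

```python
def naive_vs_count(indices):
    """VS executions when one draw covers the whole min..max vertex range."""
    if not indices:
        return 0
    return max(indices) - min(indices) + 1


def split_index_ranges(indices, max_gap=1):
    """Group referenced vertices into runs, returned as (first_vertex, count).

    max_gap is a hypothetical tuning knob: two referenced vertices are kept
    in the same run when at most max_gap unreferenced vertices lie between
    them. The unreferenced vertices inside a run are still processed.
    """
    runs = []
    for v in sorted(set(indices)):
        if runs:
            first, count = runs[-1]
            last = first + count - 1
            if v - last - 1 <= max_gap:
                # Extend the current run so that it ends at v.
                runs[-1] = (first, v - first + 1)
                continue
        runs.append((v, 1))
    return runs


indices = [0, 1000, 2]
runs = split_index_ranges(indices)
print(naive_vs_count(indices))         # 1001
print(runs)                            # [(0, 3), (1000, 1)]
print(sum(count for _, count in runs)) # 4
```

With the assumed `max_gap=1`, the example index array yields one run of 3 vertices and one run of 1 vertex, which lines up with the two DRAW commands in the dump: 4 VS executions and 4 varying slots instead of 1001. The index buffer would presumably also have to be remapped to the compacted varying layout, which this sketch does not cover.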