mesa issues
https://gitlab.freedesktop.org/flto/mesa/-/issues
2019-06-14T15:21:26Z

Address register on GC7000Lite
https://gitlab.freedesktop.org/flto/mesa/-/issues/5
2019-06-14T15:21:26Z, Jonathan Marek

deqp tests using MOVAF/MOVAR are failing, so it looks like something changed

gl_FragCoord.zw / gl_FragDepth on GC7000Lite
https://gitlab.freedesktop.org/flto/mesa/-/issues/4
2019-06-14T23:55:44Z, Jonathan Marek

* Bit 0x01000000 in `RA_EARLY_DEPTH` enables fragcoord.z
* Bit 0x04000000 in `RA_EARLY_DEPTH` enables fragcoord.w
* Blob normally sets 0x40000000 in `RA_EARLY_DEPTH` but not when using fragcoord.zw (need to check which one)
* Need to check if gl_FragDepth test works

texturing problems
https://gitlab.freedesktop.org/flto/mesa/-/issues/3
2019-06-16T01:15:55Z, Jonathan Marek

Texture errors and my debugging/RE notes.
This covers all the GLES2 texture fails for GC3000; there are more mipmap / cubemap related fails on GC7000Lite.
* [ ] GC7000Lite texture sampling in vertex shader not working at all
  * Use `VIVS_NTE_DESCRIPTOR_ADDR_MIRROR` instead of `VIVS_NTE_DESCRIPTOR_ADDR`
  * change `VIVS_VS_SAMPLER_BASE` (we add the base offset ourselves in the shader)
  * missing flush/synchronization after changing texture state (works if 2000 NOPs inserted...)
* [ ] GC7000Lite RGB565 texture
  * blt engine is using `B4G4R4A4` format for the blit - somehow the 4-bit alpha channel is set to all ones (RGB565 works if the blt engine uses RGB565 for the blit)
* [ ] `linear_nearest` / `nearest_linear` filters broken
  * Looks like HW will never choose MIN filter if LOD_CONFIG_MAX is 0; setting it to at least 1 fixes it (on GC7000Lite, SAMP_LOD_MINMAX_MAX needs to be at least 4 - I guess this means rnndb is wrong and the lower 2 bits don't exist)
* [ ] `nearest` filter behavior is wrong
  * caused by the `VIVS_TE_SAMPLER_CONFIG0_ROUND_UV` bit, which shouldn't be set
* [ ] cube map texture alignment problem
  * The use of `etna_adjust_rs_align` to adjust Y alignment is wrong, as there's no way to tell the texture hardware about this alignment (this affects cubemaps and array textures if they get implemented). In GLES2 this affects A8 / L8A8 formats on GC3000, because the tiling mode has a Y alignment of 4 which is then adjusted to 8.
* [ ] rendering to cube maps is broken (for resource copy / mipmap generation)
  * Using the software mipmaps/blit works as a temporary fix; haven't yet figured out the real problem
* [ ] cube map sampling from wrong mipmap layer (GC3000, possibly others)
  * HW rounding error when calculating LOD for cube map mipmaps? The deqp test draws (for example) an 8x8 surface and one pixel comes from the 4x4 mipmap layer (other pixels OK). The error seems large because I need to hack in a LOD_BIAS of -3 (-0.09375) for it to not sample from the next layer (-2 is not enough).

Dual 16 mode
https://gitlab.freedesktop.org/flto/mesa/-/issues/1
2019-08-09T03:00:06Z, Jonathan Marek

GC3000/GC7000Lite have a "dual 16" mode for pixel shaders, which uses 16-bit float values and runs ALU instructions twice as fast. We need to wait for mediump support in mesa (coming for freedreno) but for now we have a patch to force enable it.
By default the GC3000 blob does not use DUAL16 mode. We can force DUAL16 mode on with the blob using the following environment variable: `VC_OPTION=-DUAL16:2` (2 = force on, 1 = detect). For the `shading:shading=cel` scene in glmark2, this boosts the FPS from ~240 to ~360.
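A minimal sketch of setting that variable for a benchmark run. The helper name `blob_env`, the `glmark2-es2` binary name and the command line are assumptions about the local setup, not something taken from the blob's documentation; only the `VC_OPTION=-DUAL16:<mode>` string and the two mode values come from the note above.

```python
import os

# Mode values from the note above: 1 = detect, 2 = force on.
DUAL16_DETECT = 1
DUAL16_FORCE_ON = 2


def blob_env(mode, base_env=None):
    """Return a copy of the environment with VC_OPTION=-DUAL16:<mode> set.

    Hypothetical helper; only the two modes mentioned in the note are accepted.
    """
    if mode not in (DUAL16_DETECT, DUAL16_FORCE_ON):
        raise ValueError("only modes 1 (detect) and 2 (force on) are known")
    env = dict(os.environ if base_env is None else base_env)
    env["VC_OPTION"] = f"-DUAL16:{mode}"
    return env


# Assumed command line; adjust the binary name to the local install.
command = ["glmark2-es2", "-b", "shading:shading=cel"]
env = blob_env(DUAL16_FORCE_ON, base_env={})
# To actually launch it with the blob driver installed:
#   subprocess.run(command, env=blob_env(DUAL16_FORCE_ON))
```

Running the same command once with mode 1 and once with mode 2 should be enough to compare the two FPS figures on a given board.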
To enable DUAL16 mode:
* GC3000: set 0x20000000 bit in VS_UNIFORM_CACHE
* GC7000Lite: set DUAL16 bit in SH_CONFIG
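The GC3000 case can be sketched as a read-modify-write on a 32-bit state word. This is only an illustration: the helper names are made up, how the register value is read and emitted is left out, and the numeric value of the GC7000Lite `DUAL16` bit in `SH_CONFIG` is not given in these notes, so it is not defined here.

```python
# Bit value taken from the list above (GC3000, VS_UNIFORM_CACHE).
GC3000_VS_UNIFORM_CACHE_DUAL16 = 0x20000000

# State words are treated as 32-bit values in this sketch.
WORD_MASK = 0xFFFFFFFF


def set_bits(reg_value, mask):
    """Return reg_value with every bit in mask set."""
    return (reg_value | mask) & WORD_MASK


def clear_bits(reg_value, mask):
    """Return reg_value with every bit in mask cleared."""
    return reg_value & ~mask & WORD_MASK


def enable_dual16_gc3000(vs_uniform_cache):
    """Hypothetical helper: force the DUAL16 bit on in a VS_UNIFORM_CACHE value."""
    return set_bits(vs_uniform_cache, GC3000_VS_UNIFORM_CACHE_DUAL16)
```

For GC7000Lite the same `set_bits` helper would apply to the `SH_CONFIG` value once the `DUAL16` mask is looked up in rnndb.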
Other notes:
* In dual-16 mode, two pixels are being processed at once
* `th` (high precision/32bit) registers share storage with `t` (medium precision/16bit) registers: `t` is 8x 16-bit values (4 for each "thread"/pixel) and `th` is 4x 32-bit values. When mixing highp/mediump, the SEL bits determine which half of the mediump registers (which of the two pixels) is used for the instruction. There is a bit to use `th` for the dest (bit_3_31).
* We can use "new immediates" in DUAL16 mode: for 16-bit values (float or int), use amode=6 and encode the 16-bit value directly into the 20 bits of storage for the imm value. 32-bit (20-bit) immediates might work, but only if SEL bits are used (to be tested).
* Set 0x01000000 in PS_INPUT_COUNT to get highp gl_FragCoord (in th0/th1). Otherwise it doesn't seem like gl_FragCoord works at all
* TODO: how do we control if varyings are highp or not? `DUAL16` bit in PS_INPUT_COUNT might be related. Highp varyings take 2x the register space.
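As a rough illustration of the 16-bit "new immediates" note above, the sketch below produces the 16-bit pattern that would go into the immediate field. Two things are assumptions rather than facts from these notes: that 16-bit floats use the IEEE 754 binary16 layout, and that the value sits in the low 16 bits of the 20-bit field with the upper 4 bits zero. The function names are hypothetical.

```python
import struct

IMM_FIELD_BITS = 20  # from the note: 20 bits of storage for the imm value
AMODE_IMM16 = 6      # from the note: amode=6 for 16-bit immediates


def half_bits(value):
    """Return the IEEE 754 binary16 bit pattern of a float (assumed layout)."""
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    return bits


def int16_bits(value):
    """Return the 16-bit two's-complement pattern of a signed 16-bit integer."""
    if not -0x8000 <= value <= 0x7FFF:
        raise ValueError("value does not fit in a signed 16-bit integer")
    return value & 0xFFFF


def imm_field(bits16):
    """Place a 16-bit pattern in the 20-bit field.

    ASSUMPTION: low 16 bits hold the value, upper 4 bits are zero.
    """
    if not 0 <= bits16 <= 0xFFFF:
        raise ValueError("expected a 16-bit pattern")
    return bits16 & ((1 << IMM_FIELD_BITS) - 1)
```

For example, `imm_field(half_bits(1.0))` gives `0x3C00`, which would be paired with `AMODE_IMM16` in the source operand; whether the hardware expects exactly this placement still needs to be checked against blob output.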