bar.sync encoding incorrect for GM107
Submitted by Tom de Vries
Assigned to Nouveau Project
Link to original bug (#106132)
Description
[ Quoted text copied from https://lists.freedesktop.org/archives/mesa-dev/2016-March/108926.html ]
I've been playing around with bar.sync in PTX, JIT-compiling it for GM107 (my Quadro M1200 card), and disassembling the result with cuobjdump -sass.
I looked at nv50_ir_emit_gm107.cpp to understand the instruction encoding.
[Mesa-dev] [PATCH] gm107/ir: add emission for BAR
Samuel Pitoiset samuel.pitoiset at gmail.com
Tue Mar 1 17:44:42 UTC 2016
   // barrier id
   if (insn->src(0).getFile() == FILE_GPR) {
      emitGPR(0x08, insn->src(0));
   } else {
      ImmediateValue *imm = insn->getSrc(0)->asImm();
      assert(imm);
      emitField(0x08, 8, imm->reg.data.u32);
      emitField(0x2b, 1, 1);
   }

   // thread count
   if (insn->src(1).getFile() == FILE_GPR) {
      emitGPR(0x14, insn->src(1));
   } else {
      ImmediateValue *imm = insn->getSrc(0)->asImm();
This should probably be using getSrc(1)?
      assert(imm);
      emitField(0x14, 12, imm->reg.data.u32);
      emitField(0x2c, 1, 1);
   }
From these examples:
... x x
   BAR.SYNC 0x0; /* 0xf0a81b8000070000 */
   BAR.SYNC 0xf; /* 0xf0a81b8000070f00 */
   BAR.SYNC R0;  /* 0xf0a80b8000070000 */
   BAR.SYNC R2;  /* 0xf0a80b8000070200 */
...
I derive these offsets:
- barrier id : 0x08
- barrier id imm vs reg: 0x2c
and from these examples:
... x x
   BAR.SYNC 0x0;     /* 0xf0a81b8000070000 */
   BAR.SYNC 0x0, R2; /* 0xf0a8138000270000 */
...
I derive these offsets:
- thread count : 0x14
- thread count imm vs reg: 0x2b
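The derivation can be checked mechanically. Below is a minimal standalone sketch (not Mesa code) that extracts the four fields from the sample words quoted in this report. The names field, BarOperands, decode_bar and check_samples are hypothetical, and the 8-bit register width and 12-bit immediate width are assumptions taken from the emitGPR/emitField calls in the quoted patch.

```cpp
#include <cassert>
#include <cstdint>

// Extract `size` bits starting at bit `pos` of a 64-bit instruction word.
uint64_t field(uint64_t word, unsigned pos, unsigned size)
{
   return (word >> pos) & ((uint64_t(1) << size) - 1);
}

// Bit positions as derived in this report from cuobjdump output
// (observations, not vendor documentation).
const unsigned BAR_ID_POS    = 0x08; // barrier id (register index or immediate)
const unsigned BAR_ID_IS_IMM = 0x2c; // set when the barrier id is an immediate
const unsigned COUNT_POS     = 0x14; // thread count (register index or immediate)
const unsigned COUNT_IS_IMM  = 0x2b; // set when the thread count is an immediate

struct BarOperands {
   bool     id_is_imm;
   uint32_t id;
   bool     count_is_imm;
   uint32_t count;
};

BarOperands decode_bar(uint64_t word)
{
   BarOperands ops;
   ops.id_is_imm    = field(word, BAR_ID_IS_IMM, 1) != 0;
   ops.id           = uint32_t(field(word, BAR_ID_POS, 8));
   ops.count_is_imm = field(word, COUNT_IS_IMM, 1) != 0;
   // Assumed widths: 12 bits for an immediate, 8 bits for a register index.
   ops.count        = uint32_t(field(word, COUNT_POS, ops.count_is_imm ? 12 : 8));
   return ops;
}

// Check the decoder against the disassembly quoted in this report.
void check_samples()
{
   BarOperands a = decode_bar(0xf0a81b8000070f00ull); // BAR.SYNC 0xf
   assert(a.id_is_imm && a.id == 0xf);

   BarOperands b = decode_bar(0xf0a80b8000070200ull); // BAR.SYNC R2
   assert(!b.id_is_imm && b.id == 2);

   BarOperands c = decode_bar(0xf0a8138000270000ull); // BAR.SYNC 0x0, R2
   assert(c.id_is_imm && c.id == 0);
   assert(!c.count_is_imm && c.count == 2);
}
```

Calling check_samples() passes on all three words, which is consistent with bit 0x2c flagging the barrier id and bit 0x2b flagging the thread count.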
But in the code snippet above, the roles of the two immediate-flag bits seem reversed: 0x2b is used for the barrier id, and 0x2c is used for the thread count.
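To illustrate what the swapped flags would look like, here is a standalone encoder sketch (again not Mesa code, and not a proposed patch) that sets bit 0x2c for an immediate barrier id and bit 0x2b for an immediate thread count. Operand, imm, reg, encode_bar_sync and check_encoding are hypothetical names. BASE is an assumption: it is the sample word with the four operand fields masked out, and the meaning of its remaining bits is not decoded here. The sketch also assumes that a single-operand BAR.SYNC carries an immediate thread count of 0, which is what the sample words suggest.

```cpp
#include <cassert>
#include <cstdint>

// Sample word with barrier id, thread count and both flag bits cleared.
// Assumption: the remaining bits are the same for every BAR.SYNC form.
const uint64_t BASE = 0xf0a8038000070000ull;

struct Operand {
   bool     is_imm;
   uint32_t value; // immediate value or register index
};

Operand imm(uint32_t v) { Operand o = { true, v };  return o; }
Operand reg(uint32_t n) { Operand o = { false, n }; return o; }

uint64_t encode_bar_sync(Operand id, Operand count)
{
   uint64_t word = BASE;

   // barrier id: 8 bits at 0x08, immediate flag at 0x2c
   word |= uint64_t(id.value & 0xff) << 0x08;
   if (id.is_imm)
      word |= uint64_t(1) << 0x2c;

   // thread count: 12-bit immediate or 8-bit register at 0x14,
   // immediate flag at 0x2b
   word |= uint64_t(count.value & (count.is_imm ? 0xfffu : 0xffu)) << 0x14;
   if (count.is_imm)
      word |= uint64_t(1) << 0x2b;

   return word;
}

// Compare against the cuobjdump output quoted in this report.
void check_encoding()
{
   assert(encode_bar_sync(imm(0x0), imm(0)) == 0xf0a81b8000070000ull); // BAR.SYNC 0x0
   assert(encode_bar_sync(imm(0xf), imm(0)) == 0xf0a81b8000070f00ull); // BAR.SYNC 0xf
   assert(encode_bar_sync(reg(0),   imm(0)) == 0xf0a80b8000070000ull); // BAR.SYNC R0
   assert(encode_bar_sync(reg(2),   imm(0)) == 0xf0a80b8000070200ull); // BAR.SYNC R2
   assert(encode_bar_sync(imm(0x0), reg(2)) == 0xf0a8138000270000ull); // BAR.SYNC 0x0, R2
}
```

With this flag assignment all five sample words are reproduced. With the flags as in the quoted patch (0x2b for the barrier id, 0x2c for the thread count), the BAR.SYNC R0 and BAR.SYNC 0x0, R2 words would come out exchanged in byte 5 (0x13 and 0x0b respectively).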
Version: git