freedreno/afuc: Add emulator for playing with firmware
This is a side project I've been playing with. It started with the idea that we might want a way to patch fw (#4789), either to fix bugs ourselves or to add our own pm4 packets for features that would be easier and/or more efficient to implement in fw than in cmdstream.
But first it seems useful to have a better way to compare our understanding of how the SQE operates with what the hw actually does (and also a way to test our fw patches in an environment with more visibility). This also seems like a reasonable way to document our understanding of the SQE core: it's executable documentation!
So I wrote a logical emulator for executing the a6xx SQE fw. It probably isn't complete enough yet to run through the preemption codepaths, but it can get through the bootstrap and packet-table loading, and all the normal pm4 packets that I've tried to throw at it so far.
It could also let us drop the heuristics that we currently use to find the packet-table for normal disassembly, by using the emulator to run through the bootstrap code which populates the packet-table. That would help with a660 fw (i.e. the whole LPAC ring having its own SQE, with its fw appended to the main SQE fw). It could also be useful for better parsing of zap fw, which (at least on a6xx) is just normal AFUC instructions with an embedded ir3 shader, but no packet-table. (With this approach, I think we could even identify which part of the zap shader is the embedded ir3 shader.)
Example: running through bootstrap:
; a6xx microcode
; Disassembling microcode: /home/robclark/src/linux-firmware/qcom/a630_sqe.fw
; Version: 016ee17a
instruction base: 0x1000
0000: 016ee17a [016ee17a] ; nop
(s)tep, (r)un, (d)ump, (w)rite, (p)acket, (h)elp, (q)uit: w
write: GPR (r)egister, (c)ontrol register, (g)pu register, (m)emory: g
GPU register (name or offset) and value: CP_ROQ_THRESHOLDS_1 0x8040362c
(s)tep, (r)un, (d)ump, (w)rite, (p)acket, (h)elp, (q)uit: w
write: GPR (r)egister, (c)ontrol register, (g)pu register, (m)emory: g
GPU register (name or offset) and value: CP_ROQ_THRESHOLDS_2 0x010000c0
(s)tep, (r)un, (d)ump, (w)rite, (p)acket, (h)elp, (q)uit: s
0001: 01001ecd [01001ecd] ; nop
(s)tep, (r)un, (d)ump, (w)rite, (p)acket, (h)elp, (q)uit: s
0002: 01177001 [01177001] ; nop
(s)tep, (r)un, (d)ump, (w)rite, (p)acket, (h)elp, (q)uit: s
0003: 88020001 mov $02, 0x0001
(s)tep, (r)un, (d)ump, (w)rite, (p)acket, (h)elp, (q)uit: r
GPR: $02: 00000001
0004: a8020080 cwrite $02, [$00 + 0x080], 0x0
CTRL: 0x080: 00000001
0005: 880308c1 mov $03, 0x08c1 ; CP_ROQ_THRESHOLDS_1
GPR: $03: 000008c1
0006: 88020002 mov $02, 0x0002
GPR: $02: 00000002
0007: a8020026 cwrite $02, [$00 + @REG_READ_DWORDS], 0x0
CTRL: @REG_READ_DWORDS: 00000002
0008: a8030027 cwrite $03, [$00 + @REG_READ_ADDR], 0x0
CTRL: @REG_READ_ADDR: 000008c1
0009: 981e5006 mov $0a, $regdata
GPR: $0a: 8040362c
CTRL: @REG_READ_DWORDS: 00000001
CTRL: @REG_READ_ADDR: 000008c2
000a: 995e5006 or $0a, $0a, $regdata
GPR: $0a: 814036ec
CTRL: @REG_READ_DWORDS: 00000000
CTRL: @REG_READ_ADDR: 000008c3
000b: c140000a brne $0a, 0x0, #l000 (#10, 0015)
000c: 8a05002c mov $05, 0x002c << 16
GPR: $05: 002c0000
l000: 0015: a8020026 cwrite $02, [$00 + @REG_READ_DWORDS], 0x0
CTRL: @REG_READ_DWORDS: 00000002
0016: 88050830 mov $05, 0x0830 ; CP_SQE_INSTR_BASE_LO
GPR: $05: 00000830
0017: a8050027 cwrite $05, [$00 + @REG_READ_ADDR], 0x0
CTRL: @REG_READ_ADDR: 00000830
0018: 981e1806 mov $03, $regdata
GPR: $03: 00001000
CTRL: @REG_READ_DWORDS: 00000001
CTRL: @REG_READ_ADDR: 00000831
0019: 981e2006 mov $04, $regdata
GPR: $04: 00000000
CTRL: @REG_READ_DWORDS: 00000000
CTRL: @REG_READ_ADDR: 00000832
001a: a8030018 cwrite $03, [$00 + @MEM_READ_ADDR], 0x0
CTRL: @MEM_READ_ADDR: 00001000
001b: a8040019 cwrite $04, [$00 + @MEM_READ_ADDR+0x1], 0x0
CTRL: @MEM_READ_ADDR+0x1: 00000000
001c: a802001a cwrite $02, [$00 + @MEM_READ_DWORDS], 0x0
CTRL: @MEM_READ_DWORDS: 00000002
001d: 2ba50fff and $05, $addr, 0x0fff
GPR: $05: 016eefff
CTRL: @MEM_READ_ADDR: 00001004
CTRL: @MEM_READ_ADDR+0x1: 00000000
CTRL: @MEM_READ_DWORDS: 00000001
001e: 48a50014 shl $05, $05, 0x0014
GPR: $05: fff00000
001f: 63a60008 rot $06, $addr, 0x0008
GPR: $06: 001ecd01
CTRL: @MEM_READ_ADDR: 00001008
CTRL: @MEM_READ_ADDR+0x1: 00000000
CTRL: @MEM_READ_DWORDS: 00000000
0020: 50c60006 ushr $06, $06, 0x0006
GPR: $06: 00007b34
0021: 98663801 add $07, $03, $06
GPR: $07: 00008b34
0022: 98802002 addhi $04, $04, $00
GPR: $04: 00000000
0023: 881c0080 mov $rem, 0x0080
GPR: $rem: 00000080
0024: a8070018 cwrite $07, [$00 + @MEM_READ_ADDR], 0x0
CTRL: @MEM_READ_ADDR: 00008b34
0025: a8040019 cwrite $04, [$00 + @MEM_READ_ADDR+0x1], 0x0
CTRL: @MEM_READ_ADDR+0x1: 00000000
0026: a81c001a cwrite $rem, [$00 + @MEM_READ_DWORDS], 0x0
CTRL: @MEM_READ_DWORDS: 00000080
0027: a8040058 cwrite $04, [$00 + @LOAD_STORE_HI], 0x0
CTRL: @LOAD_STORE_HI: 00000000
0028: b0e2003c load $02, [$07 + 0x03c], 0x0
GPR: $02: 00000e37
0029: a8028004 cwrite $02, [$00 + @PREEMPT_INSTR], 0x8
CTRL: @PREEMPT_INSTR: 00000e37
002a: a8000060 cwrite $00, [$00 + @PACKET_TABLE_WRITE_ADDR], 0x0
CTRL: @PACKET_TABLE_WRITE_ADDR: 00000000
002b: ac1d0061 (rep)cwrite $addr, [$00 + @PACKET_TABLE_WRITE], 0x0
GPR: $rem: 0000007f
CTRL: @MEM_READ_ADDR: 00008b38
CTRL: @MEM_READ_ADDR+0x1: 00000000
CTRL: @MEM_READ_DWORDS: 0000007f
CTRL: @PACKET_TABLE_WRITE_ADDR: 00000001
CTRL: @PACKET_TABLE_WRITE: 000000be
GPR: $rem: 0000007e
CTRL: @MEM_READ_ADDR: 00008b3c
CTRL: @MEM_READ_ADDR+0x1: 00000000
CTRL: @MEM_READ_DWORDS: 0000007e
CTRL: @PACKET_TABLE_WRITE_ADDR: 00000002
CTRL: @PACKET_TABLE_WRITE: 000000be
GPR: $rem: 0000007d
CTRL: @MEM_READ_ADDR: 00008b40
CTRL: @MEM_READ_ADDR+0x1: 00000000
CTRL: @MEM_READ_DWORDS: 0000007d
CTRL: @PACKET_TABLE_WRITE_ADDR: 00000003
CTRL: @PACKET_TABLE_WRITE: 000000d2
<snip a lot more packet table loading>
GPR: $rem: 00000001
CTRL: @MEM_READ_ADDR: 00008d30
CTRL: @MEM_READ_ADDR+0x1: 00000000
CTRL: @MEM_READ_DWORDS: 00000001
CTRL: @PACKET_TABLE_WRITE_ADDR: 0000007f
CTRL: @PACKET_TABLE_WRITE: 000000be
GPR: $rem: 00000000
CTRL: @MEM_READ_ADDR: 00008d34
CTRL: @MEM_READ_ADDR+0x1: 00000000
CTRL: @MEM_READ_DWORDS: 00000000
CTRL: @PACKET_TABLE_WRITE_ADDR: 00000080
CTRL: @PACKET_TABLE_WRITE: 000000be
002c: 88070001 mov $07, 0x0001
GPR: $07: 00000001
002d: 48420002 shl $02, $02, 0x0002
GPR: $02: 000038dc
002e: 98431001 add $02, $02, $03
GPR: $02: 000048dc
002f: a8020018 cwrite $02, [$00 + @MEM_READ_ADDR], 0x0
CTRL: @MEM_READ_ADDR: 000048dc
0030: a8040019 cwrite $04, [$00 + @MEM_READ_ADDR+0x1], 0x0
CTRL: @MEM_READ_ADDR+0x1: 00000000
0031: a807001a cwrite $07, [$00 + @MEM_READ_DWORDS], 0x0
CTRL: @MEM_READ_DWORDS: 00000001
0032: 2ba70fff and $07, $addr, 0x0fff
GPR: $07: 88000fff
CTRL: @MEM_READ_ADDR: 000048e0
CTRL: @MEM_READ_ADDR+0x1: 00000000
CTRL: @MEM_READ_DWORDS: 00000000
0033: 48e70008 shl $07, $07, 0x0008
GPR: $07: 000fff00
0034: 88020002 mov $02, 0x0002
GPR: $02: 00000002
0035: 88030001 mov $03, 0x0001
GPR: $03: 00000001
0036: ec000000 setsecure $02, #l001
0037: 01000000 nop
0038: 98a32806 or $05, $05, $03
GPR: $05: fff00001
l001: 0039: 98a72806 or $05, $05, $07
GPR: $05: ffffff01
003a: a8050100 cwrite $05, [$00 + 0x100], 0x0
CTRL: 0x100: ffffff01
003b: 880c0841 mov $0c, 0x0841 ; CP_CHICKEN_DBG
GPR: $0c: 00000841
003c: a80c0027 cwrite $0c, [$00 + @REG_READ_ADDR], 0x0
CTRL: @REG_READ_ADDR: 00000841
003d: cbc20005 brne $regdata, b2, #l002 (#5, 0042)
003e: 88020002 mov $02, 0x0002
GPR: $02: 00000002
l002: 0042: c1400007 brne $0a, 0x0, #l003 (#7, 0049)
0043: 880308c2 mov $03, 0x08c2 ; CP_ROQ_THRESHOLDS_2
GPR: $03: 000008c2
l003: 0049: 88020001 mov $02, 0x0001
GPR: $02: 00000001
004a: a8020065 cwrite $02, [$00 + 0x065], 0x0
CTRL: 0x065: 00000001
004b: 88020812 mov $02, 0x0812 ; CP_CP2GMU_STATUS
GPR: $02: 00000812
004c: 88430001 mov $03, 0x0001 << 2
GPR: $03: 00000004
004d: a8020024 cwrite $02, [$00 + @REG_WRITE_ADDR], 0x0
CTRL: @REG_WRITE_ADDR: 00000812
004e: a8030025 cwrite $03, [$00 + @REG_WRITE], 0x0
CTRL: @REG_WRITE: 00000004
004f: a8000080 cwrite $00, [$00 + 0x080], 0x0
CTRL: 0x080: 00000000
0050: d8000000 waitin
0051: 981f0806 mov $01, $data
(s)tep, (r)un, (d)ump, (w)rite, (p)acket, (h)elp, (q)uit:
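The packet-table address arithmetic in the trace above (instructions 001f through 0021) and the `(rep)cwrite` load loop at 002b can be reproduced by hand. The following is a minimal sketch in Python that uses only values printed in the trace. The function names are hypothetical and not part of the emulator, and it assumes that the `$addr` read at 001f returned the second fw dword (`01001ecd`), which is consistent with the printed result.

```python
MASK32 = 0xFFFFFFFF

def rotl32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount &= 31
    return ((value << amount) | (value >> (32 - amount))) & MASK32

def packet_table_addr(instr_base, header_dword):
    # Mirrors 001f..0021 in the trace: rot by 8, ushr by 6, add to base.
    offset = rotl32(header_dword, 8) >> 6
    return (instr_base + offset) & MASK32

def load_packet_table(read_dword, start_addr, count):
    # Hypothetical model of the (rep)cwrite at 002b: each repeat reads one
    # dword, appends it to the table, advances the read address by 4 and
    # decrements the remaining count ($rem in the trace).
    table = []
    addr = start_addr
    rem = count
    while rem > 0:
        table.append(read_dword(addr))
        addr += 4
        rem -= 1
    return table, addr

instr_base = 0x1000    # value read from CP_SQE_INSTR_BASE_LO at 0018
header = 0x01001ecd    # assumed: second fw dword, as read via $addr at 001f
start = packet_table_addr(instr_base, header)

# Stand-in memory reader that returns 0 for every address (assumption for
# illustration only; the real table contents come from the fw image).
table, end = load_packet_table(lambda addr: 0, start, 0x80)
print(hex(start), hex(end), len(table))
```

The computed start address (`0x8b34`) and end address (`0x8d34`) match the `@MEM_READ_ADDR` values shown at 0024 and at the end of the packet-table load.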
Example: setting up some state, executing a pm4 packet, and then checking the result:
(s)tep, (r)un, (d)ump, (w)rite, (p)acket, (h)elp, (q)uit: w
write: GPR (r)egister, (c)ontrol register, (g)pu register, (m)emory: m
GPU memory offset and value: 0x00000000fffffffc 0xc0ffeeee
(s)tep, (r)un, (d)ump, (w)rite, (p)acket, (h)elp, (q)uit: p
Enter packet (opc or register name), followed by payload: CP_MEMCPY 0x4 0xfffffffc 0x00000000 0x00001234 0x00000000
(s)tep, (r)un, (d)ump, (w)rite, (p)acket, (h)elp, (q)uit: r
GPR: $01: 70758005
GPR: $rem: 00000005
CP_MEMCPY:
0e2b: 981f1806 mov $03, $data
GPR: $03: 00000004
GPR: $rem: 00000004
0e2c: a81f0018 cwrite $data, [$00 + @MEM_READ_ADDR], 0x0
GPR: $rem: 00000003
CTRL: @MEM_READ_ADDR: fffffffc
0e2d: a81f0019 cwrite $data, [$00 + @MEM_READ_ADDR+0x1], 0x0
GPR: $rem: 00000002
CTRL: @MEM_READ_ADDR+0x1: 00000000
0e2e: a803001a cwrite $03, [$00 + @MEM_READ_DWORDS], 0x0
CTRL: @MEM_READ_DWORDS: 00000004
0e2f: 8b1d00a0 mov $addr, 0x00a0 << 24 ; |NRT_ADDR
GPR: $addr: a0000000
0e30: 981ff806 mov $data, $data
GPR: $rem: 00000001
GPR: $addr: a1000000
PIPE: |NRT_ADDR: 00001234
0e31: 981ff806 mov $data, $data
GPR: $rem: 00000000
GPR: $addr: a2000000
PIPE: |NRT_ADDR+0x1: 00000000
0e32: 9803e006 mov $rem, $03
GPR: $rem: 00000004
0e33: 8a1da204 mov $addr, 0xa204 << 16 ; |NRT_DATA
GPR: $addr: a2040000
0e34: 9c1df806 (rep)mov $data, $addr
GPR: $rem: 00000003
PIPE: |NRT_ADDR: 00001238
PIPE: |NRT_ADDR+0x1: 00000000
PIPE: |NRT_DATA: c0ffeeee
CTRL: @MEM_READ_ADDR: 00000000
CTRL: @MEM_READ_ADDR+0x1: 00000001
CTRL: @MEM_READ_DWORDS: 00000003
MEM: 0x0000000000001234: 0xc0ffeeee
GPR: $rem: 00000002
PIPE: |NRT_ADDR: 0000123c
PIPE: |NRT_ADDR+0x1: 00000000
PIPE: |NRT_DATA: 00000000
CTRL: @MEM_READ_ADDR: 00000004
CTRL: @MEM_READ_ADDR+0x1: 00000001
CTRL: @MEM_READ_DWORDS: 00000002
MEM: 0x0000000000001238: 0x00000000
GPR: $rem: 00000001
PIPE: |NRT_ADDR: 00001240
PIPE: |NRT_ADDR+0x1: 00000000
PIPE: |NRT_DATA: 00000000
CTRL: @MEM_READ_ADDR: 00000008
CTRL: @MEM_READ_ADDR+0x1: 00000001
CTRL: @MEM_READ_DWORDS: 00000001
MEM: 0x000000000000123c: 0x00000000
GPR: $rem: 00000000
PIPE: |NRT_ADDR: 00001244
PIPE: |NRT_ADDR+0x1: 00000000
PIPE: |NRT_DATA: 00000000
CTRL: @MEM_READ_ADDR: 0000000c
CTRL: @MEM_READ_ADDR+0x1: 00000001
CTRL: @MEM_READ_DWORDS: 00000000
MEM: 0x0000000000001240: 0x00000000
0e35: d8000000 waitin
0e36: 981f0806 mov $01, $data
(s)tep, (r)un, (d)ump, (w)rite, (p)acket, (h)elp, (q)uit: d
dump: GPR (r)egisters, (c)ontrol register, (g)pu register, (m)emory: m
GPU memory offset: 0x0000000000001234
MEM: 0x0000000000001234: 0xc0ffeeee
(s)tep, (r)un, (d)ump, (w)rite, (p)acket, (h)elp, (q)uit:
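As a cross-check of the session above, here is a rough Python model of what the trace shows CP_MEMCPY doing. The payload layout (dword count, source lo/hi, destination lo/hi) is inferred from the order in which the trace consumes `$data`, and treating unwritten memory as zero is an assumption that matches the zeros copied in the trace. `cp_memcpy` is a hypothetical helper, not emulator code.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def cp_memcpy(mem, payload):
    """Copy `count` dwords from src to dst in a sparse memory dict.

    payload = [count, src_lo, src_hi, dst_lo, dst_hi], as inferred from
    the trace (assumption, not taken from any packet documentation).
    """
    count, src_lo, src_hi, dst_lo, dst_hi = payload
    src = (src_hi << 32) | src_lo
    dst = (dst_hi << 32) | dst_lo
    for i in range(count):
        # Unwritten memory is assumed to read back as zero.
        value = mem.get((src + 4 * i) & MASK64, 0)
        mem[(dst + 4 * i) & MASK64] = value

# Same setup as the session: one dword written at 0xfffffffc.
mem = {0x00000000FFFFFFFC: 0xC0FFEEEE}
cp_memcpy(mem, [0x4, 0xFFFFFFFC, 0x00000000, 0x00001234, 0x00000000])

for addr in (0x1234, 0x1238, 0x123C, 0x1240):
    print(f"MEM: 0x{addr:016x}: 0x{mem[addr]:08x}")
```

Note that the source address crosses the 4 GiB boundary after the first dword, which is why the trace shows `@MEM_READ_ADDR+0x1` becoming `00000001`.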
After each step of execution it displays changes in the core's GPR registers, control registers, pipe registers, gpu registers, and gpu memory. For instructions that "repeat" in some form (i.e. (rep) and/or (xmovN)) the change in state is displayed after each step.
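One way to get this kind of per-step output is to record writes as they happen and flush them after every step. The sketch below is a hypothetical illustration in Python, not the emulator's actual implementation. It tracks writes rather than comparing values, because the traces above print a register even when it is rewritten with the same value (e.g. `$04` at 0022).

```python
class TrackedBank:
    """Register bank that remembers which entries were written this step."""

    def __init__(self, label):
        self.label = label    # prefix used in the output, e.g. "GPR"
        self.values = {}
        self.dirty = []       # names written since the last flush, in order

    def write(self, name, value):
        self.values[name] = value & 0xFFFFFFFF
        if name not in self.dirty:
            self.dirty.append(name)

    def flush(self):
        """Return one line per written entry and reset the dirty list."""
        lines = [f"{self.label}: {n}: {self.values[n]:08x}" for n in self.dirty]
        self.dirty.clear()
        return lines

gpr = TrackedBank("GPR")
gpr.write("$rem", 0x80)
gpr.flush()    # discard the setup write

# Model two iterations of a repeating instruction: flush after each one,
# so every iteration reports its own state change.
output = []
for _ in range(2):
    gpr.write("$rem", gpr.values["$rem"] - 1)
    output.extend(gpr.flush())

print("\n".join(output))
```

Flushing inside the loop, rather than once per instruction, is what makes each repeat show up separately.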