-
Alyssa Rosenzweig authored
History lesson! In the early days of a Panfrost, we had a library independent of the driver called `panwrap` which would be LD_PRELOAD'ed into a driver to decode its cmdstream in real-time. When upstreaming Panfrost, we realized that we would much rather have this decode functionality maintained in-tree to avoid divergence, but that we could not upstream panwrap because of its use with the legacy API. So we instead dumped GPU memory to the filesystem with an out-of-tree panwrap, and decoded that with the in-tree pandecode module. When we migrated to the new kernel, we just added support for doing this memory dump directly from the driver (via a module "pantrace"). This works, but dumping memory every frame is sloooooooooooooow and error-prone. I figured if we have pandecode in-tree, we might as well link to it directly in the driver, allowing us to decode Panfrost's command streams without dumping memory to the filesystem first. This cleans up the code *substantially* and improves dumping performance by a HUGE margin. I'm talking "several seconds per frame" to "dumping in real-time" kind of jump. Note to users: this removes the environmental option "PANTRACE_BASE". Instead, for equivalent functionality set "PAN_MESA_DEBUG=trace" and redirect stdout to the file of your choosing. This should be debugging Panfrost much more pleasant. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
fc7bcee8