A simple program written in C to learn more about the ioctls used by the
Bifrost GPU's kernel driver, so we can start drawing some triangles.
Eventually, this code will become historical.

Build Mesa with Panfrost enabled. Clone panwrap into:

mesa/src/panfrost/pandecode/

so that you have a path like:

/home/alyssa/mesa/src/panfrost/pandecode/panloader/panwrap

(Yes, really.)

Inside panwrap, make a build/ directory, cd in, and run meson .. && ninja.
You'll get libpanwrap.so out at build/panwrap/libpanwrap.so. Upload that to
the board you're trying to trace and LD_PRELOAD it in to get a trace
(including disassembly of both Midgard and Bifrost).

If your name is Ryan, uncomment "#define dvalin" in include/mali-ioctl.h and
panwrap will be built to trace a Dvalin kernel instead.
Framebuffer memory notes
=========================
The framebuffer is in BGRA8888 format. It likely uses 16x16 tiles (TODO:
verify). There is a run of zeroes every 1856 bytes, for unknown
reasons.
---
Zeroes at:
1345548
1347404
1349260
1351116
Deltas of 1856 between zero regions; groups of 1804 valid pixels in
between
2048 = 1856 + 4 * 48?
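The offsets and deltas above can be sanity-checked mechanically; the table of observed zero offsets here is copied straight from the trace notes:

```c
#include <assert.h>

/* Offsets of the observed zero regions in the dumped framebuffer. */
static const unsigned zero_offsets[] = { 1345548, 1347404, 1349260, 1351116 };

/* Spacing between consecutive zero regions. */
static const unsigned zero_delta = 1856;

/* Checks that every consecutive pair of zero regions is zero_delta apart. */
static int deltas_consistent(void)
{
        for (unsigned i = 1; i < sizeof(zero_offsets) / sizeof(zero_offsets[0]); i++)
                if (zero_offsets[i] - zero_offsets[i - 1] != zero_delta)
                        return 0;
        return 1;
}
```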
Coarse job descriptor memory map
================================
E000 (refreshed -- 0x340)
E040 (descriptor)
E060 (VERTEX)
E120 (referenced in VERTEX)
E140 (referenced in TILER)
E170 (referenced in VERTEX + TILER)
E180 (descriptor)
E1A0 (TILER)
E280 (refreshed -- 0x80)
E300 (descriptor set)
E320 (SET_VALUE)
E340 (refreshed -- 0x80)
E380 (descriptor)
E3A0 (FRAGMENT)
E3C0 (soft job chain, refreshed -- 0x28)
Conclusions:
FRAGMENT <= 32 bytes
SET_VALUE <= 32 bytes
VERTEX <= 192 bytes
TILER <= 224 bytes
This is a job-based architecture. All interesting behaviour (shaders,
rendering) is a result of jobs. Each job is sent from the driver across
the shim to the GPU. The job is encoded as a special data structure in
GPU memory.
There are two families of jobs: hardware jobs and software jobs.
Hardware jobs interact with the GPU directly; software jobs are used to
manipulate the driver. Software jobs set BASE_JD_REQ_SOFT_JOB in the
.core_req field of the atom.
Hardware jobs contain the jc pointer into GPU memory. This points to the
job descriptor. All hardware jobs begin with the job descriptor header
which is found in the shim headers. The header contains a field,
job_type, which must be set according to the job type:
Byte | Job type
----- | ---------
0 | Not started
1 | Null
2 | Set value
3 | Cache flush
4 | Compute
5 | Vertex
6 | (none)
7 | Tiler
8 | Fused
9 | Fragment
This header contains a pointer to the next job, forming sequences of
hardware jobs.
The header also denotes the type of job (vertex, fragment, or tiler).
After the header comes simple, type-specific information.
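The table above can be written down as an enum for reference; the value assignments are from the trace, but the constant names here are our own, not known driver names:

```c
/* Tentative job_type values for the job descriptor header. */
enum tentative_job_type {
        JOB_NOT_STARTED = 0,
        JOB_NULL        = 1,
        JOB_SET_VALUE   = 2,
        JOB_CACHE_FLUSH = 3,
        JOB_COMPUTE     = 4,
        JOB_VERTEX      = 5,
        /* 6 unobserved */
        JOB_TILER       = 7,
        JOB_FUSED       = 8,
        JOB_FRAGMENT    = 9,
};
```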
Set value jobs follow:
struct tentative_set_value {
        uint64_t write_iut; /* Maybe vertices or shader? */
        uint64_t unknown1;  /* Trace has it set at 3 */
};
Fragment jobs follow:
struct tentative_fragment {
        tile_coord_t min_tile_coord;
        tile_coord_t max_tile_coord;
        uint64_t fragment_fbd;
};
tile_coord_t is an ordered pair for specifying a tile. It is encoded as
a uint32_t where bits 0-11 represent the X coordinate and bits 16-27
represent the Y coordinate.
Tiles are 16x16 pixels. This can be concluded from max_tile_coord in
known render sizes.
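The tile_coord_t encoding above can be sketched with a few helpers (the helper names are ours, not the driver's):

```c
#include <stdint.h>

/* tile_coord_t: bits 0-11 hold X, bits 16-27 hold Y, in 16x16-tile units. */
typedef uint32_t tile_coord_t;

static inline tile_coord_t make_tile_coord(uint32_t x, uint32_t y)
{
        return (x & 0xFFF) | ((y & 0xFFF) << 16);
}

static inline uint32_t tile_coord_x(tile_coord_t c) { return c & 0xFFF; }
static inline uint32_t tile_coord_y(tile_coord_t c) { return (c >> 16) & 0xFFF; }

/* For a WxH render target, the max tile coordinate covers the last tile. */
static inline tile_coord_t max_tile_coord_for(uint32_t w, uint32_t h)
{
        return make_tile_coord((w - 1) / 16, (h - 1) / 16);
}
```

With 12 bits per axis, this encoding tops out at 4096 tiles, i.e. 65536 pixels per dimension.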
Fragment jobs contain an external resource, the framebuffer (in shared
memory / UMM). The framebuffer is in BGRA8888 format.
Vertex and tiler jobs follow the same structure (pointers are 32-bit to
GPU memory):
struct tentative_vertex_tiler {
        uint32_t block1[11];
        uint32_t addresses1[4];
        tentative_shader *shaderMeta;
        attribute_buffer *vb[];
        attribute_meta_t *attribute_meta[];
        uint32_t addresses2[5];
        tentative_fbd *fbd;
        uint32_t addresses3[1];
        uint32_t block2[36];
};
In tiler jobs, block1[8] encodes the drawing mode used:
Byte | Mode
----- | -----
0x01 | GL_POINTS
0x02 | GL_LINES
0x08 | GL_TRIANGLES
0x0A | GL_TRIANGLE_STRIP
0x0C | GL_TRIANGLE_FAN
The shader metadata follows a (very indirect) structure:
struct tentative_shader {
        uint64_t shader; /* ORed with flags in the low bits */
        uint32_t block[unknown];
};
Shader points directly to the compiled instruction stream. For vertex
jobs, this is the vertex shader. For tiler jobs, this is the fragment
shader.
Shaders are 128-bit (16-byte) aligned, so the low 4 bits of the shader
pointer are free to carry flags. Bit 0 is set for all shaders. Bit 2 is
set for vertex shaders. Bit 3 is set for fragment shaders.
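Under the reading that 16-byte alignment frees the low 4 bits of the pointer, the packing can be sketched as follows (flag names are our guesses):

```c
#include <assert.h>
#include <stdint.h>

#define TENTATIVE_SHADER_VALID    (1 << 0) /* set for all shaders */
#define TENTATIVE_SHADER_VERTEX   (1 << 2)
#define TENTATIVE_SHADER_FRAGMENT (1 << 3)

/* Packs flags into the low 4 bits of a 16-byte-aligned code address. */
static inline uint64_t pack_shader_ptr(uint64_t code, unsigned flags)
{
        assert((code & 0xF) == 0); /* must be 128-bit aligned */
        return code | flags;
}

static inline uint64_t shader_code(uint64_t packed)  { return packed & ~0xFull; }
static inline unsigned shader_flags(uint64_t packed) { return packed & 0xF; }
```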
The attribute buffers encode each attribute (like vertex data) specified
in the shader.
struct attribute_buffer {
        float *elements;
        size_t element_size; /* sizeof(*elements) * component_count */
        size_t total_size;   /* element_size * num_vertices */
};
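For instance, a position attribute for a single triangle (vec2 per vertex) would be filled out like this, following the field relationships in the comments above (the struct is re-declared here so the sketch is self-contained):

```c
#include <stddef.h>

struct attribute_buffer {
        float *elements;
        size_t element_size; /* sizeof(*elements) * component_count */
        size_t total_size;   /* element_size * num_vertices */
};

/* Three vec2 vertices of a triangle. */
static float triangle[] = {
        -0.5f, -0.5f,
         0.5f, -0.5f,
         0.0f,  0.5f,
};

static struct attribute_buffer make_position_buffer(void)
{
        struct attribute_buffer vb;
        vb.elements = triangle;
        vb.element_size = sizeof(float) * 2;     /* vec2 */
        vb.total_size = vb.element_size * 3;     /* 3 vertices */
        return vb;
}
```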
The attribute buffers themselves have metadata in attribute_meta_t,
which is a uint64_t internally. The lowest byte of attribute_meta_t is
the corresponding attribute number. The rest appears to be flags
(0x2DEA2200). After the final attribute, the metadata will be zero,
acting as a null terminator for the attribute list.
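Walking the zero-terminated metadata list then looks like this (a sketch; the example values reuse the flag pattern observed above):

```c
#include <stdint.h>

typedef uint64_t attribute_meta_t;

/* Two attributes carrying the observed flag pattern, then the
 * zero terminator that follows the final attribute. */
static const attribute_meta_t example_meta[] = {
        0x2DEA2200 | 0,
        0x2DEA2200 | 1,
        0,
};

/* Counts entries up to (not including) the zero terminator. */
static unsigned count_attributes(const attribute_meta_t *meta)
{
        unsigned n = 0;
        while (meta[n])
                n++;
        return n;
}

/* The corresponding attribute number lives in the lowest byte. */
static unsigned attribute_index(attribute_meta_t m)
{
        return m & 0xFF;
}
```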
Comparing a simple triangle sample with a texture mapped sample, the
following differences appear between the fragment jobs:
- Different shaders (obviously)
- Texture mapped attributes aren't decoding at all (vec5?)
- Null tripped!
null1: B291E720
null2: B291E700
(null 3 is still NULL)
Notice that these are both SAME_VA addresses 32 bytes apart.
null1 contains texture metadata. In this case, the 0x20 buffer is:
23 00 00 00 00 00 01 00 88 e8 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
null2 contains a(nother) list of addresses, like how attributes are
encoded. In this case, the buffer is:
c0 81 02 02 01 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
It appears null1 is a zero-terminated array of metadata and null2 is the
corresponding zero-terminated array of textures themselves.
- Different uniforms (understandable)
- Shader metadata is different:
No texture map: 00 00 00 00 00 00 08 00 02 06 22 00
^ mask: 01 00 01 00 00 00 0F 00 00 08 20 00
Texture mapped: 01 00 01 00 00 00 07 00 02 0e 02 00
- Addr[8] is a little different, but not necessarily related
- Addr[9] is one byte different (ditto)
import gdb
import gdb.printing

class MaliPtr:
    """ Print a GPU pointer """

    def __init__(self, val):
        self.val = val

    def to_string(self):
        return hex(self.val)

pp = gdb.printing.RegexpCollectionPrettyPrinter("panloader")
pp.add_printer('gpu_ptr', '^mali_ptr$', MaliPtr)
gdb.printing.register_pretty_printer(None, pp)
/*
* © Copyright 2017-2018 The Panfrost Community
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* A copy of the licence is included with the program, and can also be obtained
* from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
* Boston, MA 02110-1301, USA.
*
*/
#ifndef __MALI_IOCTL_DVALIN_H__
#define __MALI_IOCTL_DVALIN_H__
union mali_ioctl_mem_alloc {
        union mali_ioctl_header header;
        /* [in] */
        struct {
                u64 va_pages;
                u64 commit_pages;
                u64 extent;
                u64 flags;
        } in;
        struct {
                u64 flags;
                u64 gpu_va;
        } out;
} __attribute__((packed));

struct mali_ioctl_get_gpuprops {
        u64 buffer;
        u32 size;
        u32 flags;
};
#define MALI_IOCTL_TYPE_BASE 0x80
#define MALI_IOCTL_TYPE_MAX 0x81
#define MALI_IOCTL_TYPE_COUNT (MALI_IOCTL_TYPE_MAX - MALI_IOCTL_TYPE_BASE + 1)
#define MALI_IOCTL_GET_VERSION (_IOWR(0x80, 0, struct mali_ioctl_get_version))
#define MALI_IOCTL_SET_FLAGS ( _IOW(0x80, 1, struct mali_ioctl_set_flags))
#define MALI_IOCTL_JOB_SUBMIT ( _IOW(0x80, 2, struct mali_ioctl_job_submit))
#define MALI_IOCTL_GET_GPUPROPS ( _IOW(0x80, 3, struct mali_ioctl_get_gpuprops))
#define MALI_IOCTL_POST_TERM ( _IO(0x80, 4))
#define MALI_IOCTL_MEM_ALLOC (_IOWR(0x80, 5, union mali_ioctl_mem_alloc))
#define MALI_IOCTL_MEM_QUERY (_IOWR(0x80, 6, struct mali_ioctl_mem_query))
#define MALI_IOCTL_MEM_FREE ( _IOW(0x80, 7, struct mali_ioctl_mem_free))
#define MALI_IOCTL_HWCNT_SETUP ( _IOW(0x80, 8, __ioctl_placeholder))
#define MALI_IOCTL_HWCNT_ENABLE ( _IOW(0x80, 9, __ioctl_placeholder))
#define MALI_IOCTL_HWCNT_DUMP ( _IO(0x80, 10))
#define MALI_IOCTL_HWCNT_CLEAR ( _IO(0x80, 11))
#define MALI_IOCTL_DISJOINT_QUERY ( _IOR(0x80, 12, __ioctl_placeholder))
// Get DDK version
// mem jit init
#define MALI_IOCTL_SYNC ( _IOW(0x80, 15, struct mali_ioctl_sync))
#define MALI_IOCTL_FIND_CPU_OFFSET (_IOWR(0x80, 16, __ioctl_placeholder))
#define MALI_IOCTL_GET_CONTEXT_ID ( _IOR(0x80, 17, struct mali_ioctl_get_context_id))
// TLStream acquire
// TLStream Flush
#define MALI_IOCTL_MEM_COMMIT ( _IOW(0x80, 20, struct mali_ioctl_mem_commit))
#define MALI_IOCTL_MEM_ALIAS (_IOWR(0x80, 21, struct mali_ioctl_mem_alias))
#define MALI_IOCTL_MEM_IMPORT (_IOWR(0x80, 22, struct mali_ioctl_mem_import))
#define MALI_IOCTL_MEM_FLAGS_CHANGE ( _IOW(0x80, 23, struct mali_ioctl_mem_flags_change))
#define MALI_IOCTL_STREAM_CREATE ( _IOW(0x80, 24, struct mali_ioctl_stream_create))
#define MALI_IOCTL_FENCE_VALIDATE ( _IOW(0x80, 25, __ioctl_placeholder))
#define MALI_IOCTL_GET_PROFILING_CONTROLS ( _IOW(0x80, 26, __ioctl_placeholder))
#define MALI_IOCTL_DEBUGFS_MEM_PROFILE_ADD ( _IOW(0x80, 27, __ioctl_placeholder))
// Soft event update
// sticky resource map
// sticky resource unmap
// Find gpu start and offset
#define MALI_IOCTL_HWCNT_SET ( _IOW(0x80, 32, __ioctl_placeholder))
// gwt start
// gwt stop
// gwt dump
/// Begin TEST type region
/// End TEST type region
#endif /* __MALI_IOCTL_DVALIN_H__ */
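The _IO/_IOW/_IOWR request macros above fold the transfer direction, the type byte (0x80), the command number, and the argument size into a single request word; on Linux the _IOC_* accessors recover those fields. A minimal sanity check, using a stand-in struct rather than the real shim header type:

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* Stand-in for the real mali_ioctl_get_version; _IOWR folds its
 * sizeof into the request word. */
struct example_get_version {
        uint16_t major;
        uint16_t minor;
};

#define EXAMPLE_IOCTL_GET_VERSION (_IOWR(0x80, 0, struct example_get_version))
```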
/*
* © Copyright 2017-2018 The Panfrost Community
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* A copy of the licence is included with the program, and can also be obtained
* from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
* Boston, MA 02110-1301, USA.
*
*/
#ifndef __MALI_IOCTL_MIDGARD_H__
#define __MALI_IOCTL_MIDGARD_H__
#define MALI_IOCTL_TYPE_BASE 0x80
#define MALI_IOCTL_TYPE_MAX 0x82
union mali_ioctl_mem_alloc {
        struct {
                union mali_ioctl_header header;
                /* [in] */
                u64 va_pages;
                u64 commit_pages;
                u64 extent;
                /* [in/out] */
                u64 flags;
                /* [out] */
                mali_ptr gpu_va;
                u16 va_alignment;
                u32 :32;
                u16 :16;
        } inout;
} __attribute__((packed));
#define MALI_IOCTL_TYPE_COUNT (MALI_IOCTL_TYPE_MAX - MALI_IOCTL_TYPE_BASE + 1)
#define MALI_IOCTL_GET_VERSION (_IOWR(0x80, 0, struct mali_ioctl_get_version))
#define MALI_IOCTL_MEM_ALLOC (_IOWR(0x82, 0, union mali_ioctl_mem_alloc))
#define MALI_IOCTL_MEM_IMPORT (_IOWR(0x82, 1, struct mali_ioctl_mem_import))
#define MALI_IOCTL_MEM_COMMIT (_IOWR(0x82, 2, struct mali_ioctl_mem_commit))
#define MALI_IOCTL_MEM_QUERY (_IOWR(0x82, 3, struct mali_ioctl_mem_query))
#define MALI_IOCTL_MEM_FREE (_IOWR(0x82, 4, struct mali_ioctl_mem_free))
#define MALI_IOCTL_MEM_FLAGS_CHANGE (_IOWR(0x82, 5, struct mali_ioctl_mem_flags_change))
#define MALI_IOCTL_MEM_ALIAS (_IOWR(0x82, 6, struct mali_ioctl_mem_alias))
#define MALI_IOCTL_SYNC (_IOWR(0x82, 8, struct mali_ioctl_sync))
#define MALI_IOCTL_POST_TERM (_IOWR(0x82, 9, __ioctl_placeholder))
#define MALI_IOCTL_HWCNT_SETUP (_IOWR(0x82, 10, __ioctl_placeholder))
#define MALI_IOCTL_HWCNT_DUMP (_IOWR(0x82, 11, __ioctl_placeholder))
#define MALI_IOCTL_HWCNT_CLEAR (_IOWR(0x82, 12, __ioctl_placeholder))
#define MALI_IOCTL_GPU_PROPS_REG_DUMP (_IOWR(0x82, 14, struct mali_ioctl_gpu_props_reg_dump))
#define MALI_IOCTL_FIND_CPU_OFFSET (_IOWR(0x82, 15, __ioctl_placeholder))
#define MALI_IOCTL_GET_VERSION_NEW (_IOWR(0x82, 16, struct mali_ioctl_get_version))
#define MALI_IOCTL_SET_FLAGS (_IOWR(0x82, 18, struct mali_ioctl_set_flags))
#define MALI_IOCTL_SET_TEST_DATA (_IOWR(0x82, 19, __ioctl_placeholder))
#define MALI_IOCTL_INJECT_ERROR (_IOWR(0x82, 20, __ioctl_placeholder))
#define MALI_IOCTL_MODEL_CONTROL (_IOWR(0x82, 21, __ioctl_placeholder))
#define MALI_IOCTL_KEEP_GPU_POWERED (_IOWR(0x82, 22, __ioctl_placeholder))
#define MALI_IOCTL_FENCE_VALIDATE (_IOWR(0x82, 23, __ioctl_placeholder))
#define MALI_IOCTL_STREAM_CREATE (_IOWR(0x82, 24, struct mali_ioctl_stream_create))
#define MALI_IOCTL_GET_PROFILING_CONTROLS (_IOWR(0x82, 25, __ioctl_placeholder))
#define MALI_IOCTL_SET_PROFILING_CONTROLS (_IOWR(0x82, 26, __ioctl_placeholder))
#define MALI_IOCTL_DEBUGFS_MEM_PROFILE_ADD (_IOWR(0x82, 27, __ioctl_placeholder))
#define MALI_IOCTL_JOB_SUBMIT (_IOWR(0x82, 28, struct mali_ioctl_job_submit))
#define MALI_IOCTL_DISJOINT_QUERY (_IOWR(0x82, 29, __ioctl_placeholder))
#define MALI_IOCTL_GET_CONTEXT_ID (_IOWR(0x82, 31, struct mali_ioctl_get_context_id))
#define MALI_IOCTL_TLSTREAM_ACQUIRE_V10_4 (_IOWR(0x82, 32, __ioctl_placeholder))
#define MALI_IOCTL_TLSTREAM_TEST (_IOWR(0x82, 33, __ioctl_placeholder))
#define MALI_IOCTL_TLSTREAM_STATS (_IOWR(0x82, 34, __ioctl_placeholder))
#define MALI_IOCTL_TLSTREAM_FLUSH (_IOWR(0x82, 35, __ioctl_placeholder))
#define MALI_IOCTL_HWCNT_READER_SETUP (_IOWR(0x82, 36, __ioctl_placeholder))
#define MALI_IOCTL_SET_PRFCNT_VALUES (_IOWR(0x82, 37, __ioctl_placeholder))
#define MALI_IOCTL_SOFT_EVENT_UPDATE (_IOWR(0x82, 38, __ioctl_placeholder))
#define MALI_IOCTL_MEM_JIT_INIT (_IOWR(0x82, 39, __ioctl_placeholder))
#define MALI_IOCTL_TLSTREAM_ACQUIRE (_IOWR(0x82, 40, __ioctl_placeholder))
#endif /* __MALI_IOCTL_MIDGARD_H__ */
/*
* © Copyright 2017-2018 The Panfrost Community
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU license.
*
* A copy of the licence is included with the program, and can also be obtained
* from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
* Boston, MA 02110-1301, USA.
*
*/
#ifndef __MALI_PROPS_H__
#define __MALI_PROPS_H__
#include "mali-ioctl.h"
#define MALI_GPU_NUM_TEXTURE_FEATURES_REGISTERS 3
#define MALI_GPU_MAX_JOB_SLOTS 16
#define MALI_MAX_COHERENT_GROUPS 16
/* Capabilities of a job slot as reported by JS_FEATURES registers */
#define JS_FEATURE_NULL_JOB (1u << 1)
#define JS_FEATURE_SET_VALUE_JOB (1u << 2)
#define JS_FEATURE_CACHE_FLUSH_JOB (1u << 3)
#define JS_FEATURE_COMPUTE_JOB (1u << 4)
#define JS_FEATURE_VERTEX_JOB (1u << 5)
#define JS_FEATURE_GEOMETRY_JOB (1u << 6)
#define JS_FEATURE_TILER_JOB (1u << 7)
#define JS_FEATURE_FUSED_JOB (1u << 8)
#define JS_FEATURE_FRAGMENT_JOB (1u << 9)
struct mali_gpu_core_props {
/**
* Product specific value.
*/
u32 product_id;
/**
* Status of the GPU release.
* No defined values, but starts at 0 and increases by one for each
* release status (alpha, beta, EAC, etc.).
* 4 bit values (0-15).
*/
u16 version_status;
/**
* Minor release number of the GPU. "P" part of an "RnPn" release
* number.
* 8 bit values (0-255).
*/
u16 minor_revision;
/**
* Major release number of the GPU. "R" part of an "RnPn" release
* number.
* 4 bit values (0-15).
*/
u16 major_revision;
u16 :16;
/**
* @usecase GPU clock speed is not specified in the Midgard
* Architecture, but is <b>necessary for OpenCL's clGetDeviceInfo()
* function</b>.
*/
u32 gpu_speed_mhz;
/**
* @usecase GPU clock max/min speed is required for computing
* best/worst case in tasks such as job scheduling and irq_throttling. (It
* is not specified in the Midgard Architecture).
*/
u32 gpu_freq_khz_max;
u32 gpu_freq_khz_min;
/**
* Size of the shader program counter, in bits.
*/
u32 log2_program_counter_size;
/**
* TEXTURE_FEATURES_x registers, as exposed by the GPU. This is a
* bitpattern where a set bit indicates that the format is supported.
*
* Before using a texture format, it is recommended that the
* corresponding bit be checked.
*/
u32 texture_features[MALI_GPU_NUM_TEXTURE_FEATURES_REGISTERS];
/**
* Theoretical maximum memory available to the GPU. It is unlikely
* that a client will be able to allocate all of this memory for their
* own purposes, but this at least provides an upper bound on the
* memory available to the GPU.
*
* This is required for OpenCL's clGetDeviceInfo() call when
* CL_DEVICE_GLOBAL_MEM_SIZE is requested, for OpenCL GPU devices. The
* client will not be expecting to allocate anywhere near this value.
*/
u64 gpu_available_memory_size;
};
struct mali_gpu_l2_cache_props {
        u8 log2_line_size;
        u8 log2_cache_size;
        u8 num_l2_slices; /* Number of L2C slices. 1 or higher */
        u64 :40;
};

struct mali_gpu_tiler_props {
        u32 bin_size_bytes; /* Max is 4*2^15 */
        u32 max_active_levels; /* Max is 2^15 */
};
struct mali_gpu_thread_props {
        u32 max_threads;            /* Max. number of threads per core */
        u32 max_workgroup_size;     /* Max. number of threads per workgroup */
        u32 max_barrier_size;       /* Max. number of threads that can
                                       synchronize on a simple barrier */
        u16 max_registers;          /* Total size [1..65535] of the register
                                       file available per core. */
        u8 max_task_queue;          /* Max. tasks [1..255] which may be sent
                                       to a core before it becomes blocked. */
        u8 max_thread_group_split;  /* Max. allowed value [1..15] of the
                                       Thread Group Split field. */
        enum {
                MALI_GPU_IMPLEMENTATION_UNKNOWN = 0,
                MALI_GPU_IMPLEMENTATION_SILICON = 1,
                MALI_GPU_IMPLEMENTATION_FPGA = 2,
                MALI_GPU_IMPLEMENTATION_SW = 3,
        } impl_tech :8;
        u64 :56;
};
/**
* @brief descriptor for a coherent group
*
* \c core_mask exposes all cores in that coherent group, and \c num_cores
* provides a cached population-count for that mask.
*
* @note Whilst all cores are exposed in the mask, not all may be available to
* the application, depending on the Kernel Power policy.
*
* @note if u64s must be 8-byte aligned, then this structure has 32-bits of
* wastage.
*/
struct mali_ioctl_gpu_coherent_group {
        u64 core_mask;  /**< Core restriction mask required for the group */
        u16 num_cores;  /**< Number of cores in the group */
        u64 :48;
};
/**
* @brief Coherency group information
*
* Note that the sizes of the members could be reduced. However, the \c group
* member might be 8-byte aligned to ensure the u64 core_mask is 8-byte
* aligned, thus leading to wastage if the other members sizes were reduced.
*
* The groups are sorted by core mask. The core masks are non-repeating and do
* not intersect.
*/
struct mali_gpu_coherent_group_info {
u32 num_groups;
/**
* Number of core groups (coherent or not) in the GPU. Equivalent to
* the number of L2 Caches.
*
* The GPU Counter dumping writes 2048 bytes per core group,
* regardless of whether the core groups are coherent or not. Hence
* this member is needed to calculate how much memory is required for
* dumping.
*
* @note Do not use it to work out how many valid elements are in the
* group[] member. Use num_groups instead.
*/
u32 num_core_groups;
/**
* Coherency features of the memory, accessed by @ref gpu_mem_features
* methods
*/
u32 coherency;
u32 :32;
/**
* Descriptors of coherent groups
*/
struct mali_ioctl_gpu_coherent_group group[MALI_MAX_COHERENT_GROUPS];
};
/**
* A complete description of the GPU's Hardware Configuration Discovery
* registers.
*
* The information is presented inefficiently for access. For frequent access,
* the values should be better expressed in an unpacked form in the
* base_gpu_props structure.
*
* @usecase The raw properties in @ref gpu_raw_gpu_props are necessary to
* allow a user of the Mali Tools (e.g. PAT) to determine "Why is this device
* behaving differently?". In this case, all information about the
* configuration is potentially useful, but it <b>does not need to be processed
* by the driver</b>. Instead, the raw registers can be processed by the Mali
* Tools software on the host PC.
*
*/
struct mali_gpu_raw_props {
        u64 shader_present;
        u64 tiler_present;
        u64 l2_present;
        u64 stack_present;
        u32 l2_features;
        u32 suspend_size; /* API 8.2+ */
        u32 mem_features;
        u32 mmu_features;
        u32 as_present;
        u32 js_present;
        u32 js_features[MALI_GPU_MAX_JOB_SLOTS];
        u32 tiler_features;
        u32 texture_features[3];
        u32 gpu_id;
        u32 thread_max_threads;
        u32 thread_max_workgroup_size;
        u32 thread_max_barrier_size;
        u32 thread_features;
        /*
         * Note: This is the _selected_ coherency mode rather than the
         * available modes as exposed in the coherency_features register.
         */
        u32 coherency_mode;
};
struct mali_ioctl_gpu_props_reg_dump {
        union mali_ioctl_header header;
        struct mali_gpu_core_props core;
        struct mali_gpu_l2_cache_props l2;
        u64 :64;
        struct mali_gpu_tiler_props tiler;
        struct mali_gpu_thread_props thread;
        struct mali_gpu_raw_props raw;
        /** This must be last member of the structure */
        struct mali_gpu_coherent_group_info coherency_info;
} __attribute__((packed));
#endif
@@ -10,6 +10,7 @@ conf_data.set(
'IS_MMAP64_SEPERATE_SYMBOL',
cc.links(
'''
#undef _FILE_OFFSET_BITS
#include <sys/mman.h>
void *mmap64(void *addr, size_t length, int prot, int flags, int fd,
@@ -25,6 +26,7 @@ conf_data.set(
'IS_OPEN64_SEPERATE_SYMBOL',
cc.links(
'''
#undef _FILE_OFFSET_BITS
#include <sys/stat.h>
#include <fcntl.h>
......
@@ -20,7 +20,6 @@
#define __PANLOADER_UTIL_H__
#include <inttypes.h>
#include <config.h>
typedef uint8_t u8;
typedef uint16_t u16;
@@ -32,65 +31,4 @@ typedef int16_t s16;
typedef int32_t s32;
typedef int64_t s64;
/* ASSERT_SIZEOF_TYPE:
*
* Forces compilation to fail if the size of the struct differs from the given
* arch-specific size that was observed during tracing. A size of 0 indicates
* that the ioctl has not been observed in a trace yet, and thus it's size is
* unconfirmed.
*
* Useful for preventing mistakenly extending the length of an ioctl struct and
* thus, causing all members part of said extension to be located at incorrect
* memory locations.
*/
#ifdef __LP64__
#define ASSERT_SIZEOF_TYPE(type__, size32__, size64__) \
_Static_assert(size64__ == 0 || sizeof(type__) == size64__, \
#type__ " does not match expected size " #size64__)
#else
#define ASSERT_SIZEOF_TYPE(type__, size32__, size64__) \
_Static_assert(size32__ == 0 || sizeof(type__) == size32__, \
#type__ " does not match expected size " #size32__)
#endif
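As a usage sketch (the struct name is hypothetical, and the 64-bit branch of the macro is reproduced here so the example is self-contained), the guard catches size drift at compile time:

```c
#include <stdint.h>

/* 64-bit branch of ASSERT_SIZEOF_TYPE, reproduced for the example. */
#define ASSERT_SIZEOF_TYPE(type__, size32__, size64__) \
        _Static_assert(size64__ == 0 || sizeof(type__) == size64__, \
                       #type__ " does not match expected size " #size64__)

/* Hypothetical struct observed as 16 bytes in traces on both arches. */
struct example_ioctl_pair {
        uint64_t in;
        uint64_t out;
};

/* Compilation fails if the struct ever drifts from the traced size. */
ASSERT_SIZEOF_TYPE(struct example_ioctl_pair, 16, 16);
```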
#define __PASTE_TOKENS(a, b) a ## b
/*
* PASTE_TOKENS(a, b):
*
* Expands a and b, then concatenates the resulting tokens
*/
#define PASTE_TOKENS(a, b) __PASTE_TOKENS(a, b)
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#define OFFSET_OF(type, member) __builtin_offsetof(type, member)
#define YES_NO(b) ((b) ? "Yes" : "No")
#define PANLOADER_CONSTRUCTOR \
static void __attribute__((constructor)) PASTE_TOKENS(__panloader_ctor_l, __LINE__)()
#define PANLOADER_DESTRUCTOR \
static void __attribute__((destructor)) PASTE_TOKENS(__panloader_dtor_l, __LINE__)()
#define msleep(n) (usleep((n) * 1000))
/* Semantic logging type.
*
* Raw: for raw messages to be printed as is.
* Message: for helpful information to be commented out in replays.
* Property: for properties of a struct
*
* Use one of panwrap_log, panwrap_msg, or panwrap_prop as syntax sugar.
*/
enum panwrap_log_type {
PANWRAP_RAW,
PANWRAP_MESSAGE,
PANWRAP_PROPERTY
};
#define panwrap_log(...) panwrap_log_typed(PANWRAP_RAW, __VA_ARGS__)
#define panwrap_msg(...) panwrap_log_typed(PANWRAP_MESSAGE, __VA_ARGS__)
#define panwrap_prop(...) panwrap_log_typed(PANWRAP_PROPERTY, __VA_ARGS__)
#endif /* __PANLOADER_UTIL_H__ */
@@ -10,8 +10,6 @@ is_android = cc.get_define('__ANDROID__') == '1'
test_cc_flags = [
'-O3',
# '-pg',
# '-g',
'-Wall',
'-Wno-unused-parameter',
'-Wno-sign-compare',
@@ -38,12 +36,20 @@ common_exec_largs += ['-pg']
m_dep = cc.find_library('m', required: true)
dl_dep = cc.find_library('dl', required: false)
pthread_dep = dependency('threads')
spd_dep = dependency('ShaderProgramDisassembler', required: true)
libpanfrost_decode_dep = cc.find_library('libpanfrost_decode', dirs : meson.current_source_dir() + '/../../../../build/src/panfrost/pandecode')
libpanfrost_midgard_dep = cc.find_library('libpanfrost_midgard', dirs : meson.current_source_dir() + '/../../../../build/src/panfrost/midgard')
libpanfrost_bifrost_dep = cc.find_library('libpanfrost_bifrost', dirs : meson.current_source_dir() + '/../../../../build/src/panfrost/bifrost')
util_dep = cc.find_library('libmesa_util', dirs : meson.current_source_dir() + '/../../../../build/src/util')
common_dep = [
m_dep,
dl_dep,
pthread_dep,
libpanfrost_decode_dep,
libpanfrost_midgard_dep,
libpanfrost_bifrost_dep,
util_dep
]
if not cc.has_argument('-Werror=attributes')
@@ -72,8 +78,10 @@ foreach t: test_c_srcs
endif
endforeach
inc = include_directories('include')
inc = [
include_directories('include'),
include_directories('..'),
]
subdir('include')
subdir('trans')
subdir('panwrap')
@@ -2,8 +2,6 @@ srcs = [
'panwrap-syscall.c',
'panwrap-util.c',
'panwrap-mmap.c',
'panwrap-decoder.c',
'panwrap-shader.c'
]
shared_library(
@@ -12,7 +10,6 @@ shared_library(
include_directories: inc,
dependencies: [
common_dep,
spd_dep
],
install: true,
)
/*
* © Copyright 2017-2018 The Panfrost Community
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU licence.
*
* A copy of the licence is included with the program, and can also be obtained
* from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
* Boston, MA 02110-1301, USA.
*
*/
#ifndef PANWRAP_DECODER_H
#define PANWRAP_DECODER_H
#include <mali-ioctl.h>
#include <mali-job.h>
#include "panwrap.h"
int panwrap_replay_jc(mali_ptr jc_gpu_va, bool bifrost);
int panwrap_replay_soft_replay(mali_ptr jc_gpu_va);
#endif /* !PANWRAP_DECODER_H */
@@ -28,134 +28,11 @@
#include <linux/mman.h>
#endif
#include "public.h"
static LIST_HEAD(allocations);
static LIST_HEAD(mmaps);
#define FLAG_INFO(flag) { flag, #flag }
static const struct panwrap_flag_info mmap_flags_flag_info[] = {
FLAG_INFO(MAP_SHARED),
FLAG_INFO(MAP_PRIVATE),
FLAG_INFO(MAP_ANONYMOUS),
FLAG_INFO(MAP_DENYWRITE),
FLAG_INFO(MAP_FIXED),
FLAG_INFO(MAP_GROWSDOWN),
FLAG_INFO(MAP_HUGETLB),
FLAG_INFO(MAP_LOCKED),
FLAG_INFO(MAP_NONBLOCK),
FLAG_INFO(MAP_NORESERVE),
FLAG_INFO(MAP_POPULATE),
FLAG_INFO(MAP_STACK),
#if MAP_UNINITIALIZED != 0
FLAG_INFO(MAP_UNINITIALIZED),
#endif
{}
};
static const struct panwrap_flag_info mmap_prot_flag_info[] = {
FLAG_INFO(PROT_EXEC),
FLAG_INFO(PROT_READ),
FLAG_INFO(PROT_WRITE),
{}
};
#undef FLAG_INFO
char* pointer_as_memory_reference(mali_ptr ptr)
{
struct panwrap_mapped_memory *mapped;
char *out = malloc(128);
/* First check for SAME_VA mappings, then look for non-SAME_VA
* mappings, then for unmapped regions */
if ((ptr == (uintptr_t) ptr && (mapped = panwrap_find_mapped_mem_containing((void*) (uintptr_t) ptr))) ||
(mapped = panwrap_find_mapped_gpu_mem_containing(ptr))) {
snprintf(out, 128, "alloc_gpu_va_%d + %d", mapped->allocation_number, (int) (ptr - mapped->gpu_va));
return out;
}
struct panwrap_allocated_memory *pos, *mem = NULL;
/* Find the pending unmapped allocation for the memory */
list_for_each_entry(pos, &allocations, node) {
if (ptr >= pos->gpu_va && ptr < (pos->gpu_va + pos->length)) {
mem = pos;
break;
}
}
if (mem) {
snprintf(out, 128, "alloc_gpu_va_%d + %d", mem->allocation_number, (int) (ptr - mem->gpu_va));
return out;
}
/* Just use the raw address if other options are exhausted */
snprintf(out, 128, MALI_PTR_FMT, ptr);
return out;
}
/* On job submission, there will be a -lot- of structures built up in memory.
* While we could decode them, for triangle #1 it's easier to just dump them
* all verbatim, as hex arrays, and memcpy them into the allocated memory
* spaces. The main issue is address fix up, which we also handle here. */
void replay_memory_specific(struct panwrap_mapped_memory *pos, int offset, int len)
{
/* If we don't have write access, no replay :) */
if (!(pos->flags & MALI_MEM_PROT_CPU_WR)) return;
/* Tracking these types of mappings would require more
* sophistication to avoid faulting when reading pages that
* haven't been committed yet, so don't try and read them.
*/
if (pos->flags & MALI_MEM_GROW_ON_GPF) return;
if (pos->flags & MALI_MEM_PROT_GPU_EX) {
if (offset)
panwrap_msg("Shader sync not supported!\n");
/* Shader memory get dumped but not replayed, as the
* dis/assembler is setup in-tree as it is. */
char filename[128];
snprintf(filename, 128, "%s.bin", pos->name);
FILE *fp = fopen(filename, "wb");
fwrite(pos->addr, 1, len, fp);
fclose(fp);
const char *prefix = "";
panwrap_log("%s FILE *f_%s = fopen(\"%s\", \"rb\");\n", prefix, pos->name, filename);
panwrap_log("%s fread(%s, 1, %d, f_%s);\n", prefix, pos->name, len, pos->name);
panwrap_log("%s fclose(f_%s);\n", prefix, pos->name);
} else {
/* Fill it with dumped memory, skipping zeroes */
uint32_t *array = (uint32_t *) pos->addr;
for (uint32_t i = offset / sizeof(uint32_t); i < (offset + len) / sizeof(uint32_t); ++i) {
if (array[i])
panwrap_log("%s%s[%d] = %s;\n", pos->touched[i] ? "// " : "", pos->name, i, pointer_as_memory_reference(array[i]));
}
/* Touch what we have written */
/* TODO: Implement correctly */
// memset(pos->touched + (offset / sizeof(uint32_t)), 1, len / sizeof(uint32_t));
}
panwrap_log("\n");
}
void replay_memory()
{
struct panwrap_mapped_memory *pos;
list_for_each_entry(pos, &mmaps, node) {
replay_memory_specific(pos, 0, pos->length);
}
}
void panwrap_track_allocation(mali_ptr addr, int flags, int number, size_t length)
{
struct panwrap_allocated_memory *mem = malloc(sizeof(*mem));
@@ -166,8 +43,6 @@ void panwrap_track_allocation(mali_ptr addr, int flags, int number, size_t length)
mem->allocation_number = number;
mem->length = length;
panwrap_msg("%llx\n", addr);
list_add(&mem->node, &allocations);
/* XXX: Hacky workaround for cz's board */
@@ -189,204 +64,32 @@ void panwrap_track_mmap(mali_ptr gpu_va, void *addr, size_t length,
}
}
if (!mem) {
panwrap_msg("Error: Untracked gpu memory " MALI_PTR_FMT " mapped to %p\n",
gpu_va, addr);
panwrap_msg("\tprot = ");
panwrap_log_decoded_flags(mmap_prot_flag_info, prot);
panwrap_log_cont("\n");
panwrap_msg("\tflags = ");
panwrap_log_decoded_flags(mmap_flags_flag_info, flags);
panwrap_log_cont("\n");
printf("// Untracked...\n");
return;
}
mapped_mem = malloc(sizeof(*mapped_mem));
list_init(&mapped_mem->node);
/* Try not to break other systems... there are so many configurations
* of userspaces/kernels/architectures and none of them are compatible,
* ugh. */
#define MEM_COOKIE_VA 0x41000
if (mem->flags & MALI_MEM_SAME_VA && gpu_va == MEM_COOKIE_VA) {
mapped_mem->gpu_va = (mali_ptr) (uintptr_t) addr;
} else {
mapped_mem->gpu_va = gpu_va;
}
mapped_mem->length = length;
mapped_mem->addr = addr;
mapped_mem->prot = prot;
mapped_mem->flags = mem->flags;
mapped_mem->allocation_number = mem->allocation_number;
mapped_mem->touched = calloc(length, sizeof(bool));
list_add(&mapped_mem->node, &mmaps);
list_del(&mem->node);
free(mem);
if (mem->flags & MALI_MEM_SAME_VA && gpu_va == MEM_COOKIE_VA)
gpu_va = (mali_ptr) (uintptr_t) addr;
/* Generate somewhat semantic name for the region */
snprintf(mapped_mem->name, sizeof(mapped_mem->name),
char name[512];
snprintf(name, sizeof(name) -1,
"%s_%d",
mem->flags & MALI_MEM_PROT_GPU_EX ? "shader" : "memory",
mapped_mem->allocation_number);
/* Map region itself */
panwrap_log("uint32_t *%s = mmap64(NULL, %zd, %d, %d, fd, alloc_gpu_va_%d);\n\n",
mapped_mem->name, length, prot, flags, mapped_mem->allocation_number);
panwrap_log("if (%s == MAP_FAILED) printf(\"Error mapping %s\\n\");\n\n",
mapped_mem->name, mapped_mem->name);
}
void panwrap_track_munmap(void *addr)
{
struct panwrap_mapped_memory *mapped_mem =
panwrap_find_mapped_mem(addr);
if (!mapped_mem) {
panwrap_msg("Unknown mmap %p unmapped\n", addr);
return;
}
mem->allocation_number);
list_del(&mapped_mem->node);
free(mapped_mem);
}
pandecode_inject_mmap(gpu_va, addr, length, name);
struct panwrap_mapped_memory *panwrap_find_mapped_mem(void *addr)
{
struct panwrap_mapped_memory *pos;
list_for_each_entry(pos, &mmaps, node) {
if (pos->addr == addr)
return pos;
}
return NULL;
}
struct panwrap_mapped_memory *panwrap_find_mapped_mem_containing(void *addr)
{
struct panwrap_mapped_memory *pos;
list_for_each_entry(pos, &mmaps, node) {
if (addr >= pos->addr && addr < pos->addr + pos->length)
return pos;
}
return NULL;
}
struct panwrap_mapped_memory *panwrap_find_mapped_gpu_mem(mali_ptr addr)
{
struct panwrap_mapped_memory *pos;
list_for_each_entry(pos, &mmaps, node) {
if (pos->gpu_va == addr)
return pos;
}
return NULL;
}
struct panwrap_mapped_memory *panwrap_find_mapped_gpu_mem_containing(mali_ptr addr)
{
struct panwrap_mapped_memory *pos;
list_for_each_entry(pos, &mmaps, node) {
if (addr >= pos->gpu_va && addr < pos->gpu_va + pos->length)
return pos;
}
return NULL;
}
void
panwrap_assert_gpu_same(const struct panwrap_mapped_memory *mem,
mali_ptr gpu_va, size_t size,
const unsigned char *data)
{
const char *buffer = panwrap_fetch_gpu_mem(mem, gpu_va, size);
for (size_t i = 0; i < size; i++) {
if (buffer[i] != data[i]) {
panwrap_msg("At " MALI_PTR_FMT ", expected:\n",
gpu_va);
panwrap_indent++;
panwrap_log_hexdump_trimmed(data, size);
panwrap_indent--;
panwrap_msg("Instead got:\n");
panwrap_indent++;
panwrap_log_hexdump_trimmed(buffer, size);
panwrap_indent--;
abort();
}
}
}
void
panwrap_assert_gpu_mem_zero(const struct panwrap_mapped_memory *mem,
mali_ptr gpu_va, size_t size)
{
const char *buffer = panwrap_fetch_gpu_mem(mem, gpu_va, size);
for (size_t i = 0; i < size; i++) {
if (buffer[i] != '\0') {
panwrap_msg("At " MALI_PTR_FMT ", expected all 0 but got:\n",
gpu_va);
panwrap_indent++;
panwrap_log_hexdump_trimmed(buffer, size);
panwrap_indent--;
abort();
}
}
}
void __attribute__((noreturn))
__panwrap_fetch_mem_err(const struct panwrap_mapped_memory *mem,
mali_ptr gpu_va, size_t size,
int line, const char *filename)
{
panwrap_indent = 0;
panwrap_msg("\n");
panwrap_msg("INVALID GPU MEMORY ACCESS @"
MALI_PTR_FMT " - " MALI_PTR_FMT ":\n",
gpu_va, gpu_va + size);
panwrap_msg("Occurred at line %d of %s\n", line, filename);
if (mem) {
panwrap_msg("Mapping information:\n");
panwrap_indent++;
panwrap_msg("CPU VA: %p - %p\n",
mem->addr, mem->addr + mem->length - 1);
panwrap_msg("GPU VA: " MALI_PTR_FMT " - " MALI_PTR_FMT "\n",
mem->gpu_va,
(mali_ptr)(mem->gpu_va + mem->length - 1));
panwrap_msg("Length: %zu bytes\n", mem->length);
panwrap_indent--;
if (!(mem->prot & MALI_MEM_PROT_CPU_RD))
panwrap_msg("Memory is only accessible from GPU\n");
else
panwrap_msg("Access length was out of bounds\n");
} else {
panwrap_msg("GPU memory is not contained within known GPU VA mappings\n");
struct panwrap_mapped_memory *pos;
list_for_each_entry(pos, &mmaps, node) {
panwrap_msg(MALI_PTR_FMT " (%p)\n", pos->gpu_va, pos->addr);
}
}
	panwrap_log_flush();
	abort();
}
@@ -41,92 +41,13 @@ struct panwrap_mapped_memory {
int allocation_number;
char name[32];
bool* touched;
struct list node;
};
/* Set this if you don't want your life to be hell while debugging */
#define DISABLE_CPU_CACHING 1
#define TOUCH_MEMSET(mem, addr, sz, offset) \
memset((mem)->touched + (((addr) - (mem)->gpu_va) / sizeof(uint32_t)), 1, ((sz) - (offset)) / sizeof(uint32_t)); \
panwrap_log("\n");
#define TOUCH_LEN(mem, addr, sz, ename, number, dyn) \
TOUCH_MEMSET(mem, addr, sz, 0) \
panwrap_log("mali_ptr %s_%d_p = pandev_upload(%d, NULL, alloc_gpu_va_%d, %s, &%s_%d, sizeof(%s_%d), false);\n\n", ename, number, (dyn) ? -1 : (int) (((addr) - (mem)->gpu_va)), (mem)->allocation_number, (mem)->name, ename, number, ename, number);
/* Job payloads are touched somewhat differently than other structures, due to
 * their variable lengths and odd packing requirements */
#define TOUCH_JOB_HEADER(mem, addr, sz, offset, number) \
TOUCH_MEMSET(mem, addr, sz, offset) \
panwrap_log("mali_ptr job_%d_p = pandev_upload(-1, NULL, alloc_gpu_va_%d, %s, &job_%d, sizeof(job_%d) - %d, true);\n\n", number, mem->allocation_number, mem->name, number, number, offset);
#define TOUCH_SEQUENTIAL(mem, addr, sz, ename, number) \
TOUCH_MEMSET(mem, addr, sz, 0) \
panwrap_log("mali_ptr %s_%d_p = pandev_upload_sequential(alloc_gpu_va_%d, %s, &%s_%d, sizeof(%s_%d));\n\n", ename, number, mem->allocation_number, mem->name, ename, number, ename, number);
/* Syntax sugar for sanely sized objects */
#define TOUCH(mem, addr, obj, ename, number, dyn) \
TOUCH_LEN(mem, addr, sizeof(typeof(obj)), ename, number, dyn)
void replay_memory(void);
void replay_memory_specific(struct panwrap_mapped_memory *pos, int offset, int len);
char *pointer_as_memory_reference(mali_ptr ptr);
void panwrap_track_allocation(mali_ptr gpu_va, int flags, int number, size_t length);
void panwrap_track_mmap(mali_ptr gpu_va, void *addr, size_t length,
int prot, int flags);
void panwrap_track_munmap(void *addr);
struct panwrap_mapped_memory *panwrap_find_mapped_mem(void *addr);
struct panwrap_mapped_memory *panwrap_find_mapped_mem_containing(void *addr);
struct panwrap_mapped_memory *panwrap_find_mapped_gpu_mem(mali_ptr addr);
struct panwrap_mapped_memory *panwrap_find_mapped_gpu_mem_containing(mali_ptr addr);
void panwrap_assert_gpu_same(const struct panwrap_mapped_memory *mem,
mali_ptr gpu_va, size_t size,
const unsigned char *data);
void panwrap_assert_gpu_mem_zero(const struct panwrap_mapped_memory *mem,
mali_ptr gpu_va, size_t size);
void __attribute__((noreturn))
__panwrap_fetch_mem_err(const struct panwrap_mapped_memory *mem,
mali_ptr gpu_va, size_t size,
int line, const char *filename);
static inline void *
__panwrap_fetch_gpu_mem(const struct panwrap_mapped_memory *mem,
mali_ptr gpu_va, size_t size,
int line, const char *filename)
{
if (!mem)
mem = panwrap_find_mapped_gpu_mem_containing(gpu_va);
if (!mem ||
size + (gpu_va - mem->gpu_va) > mem->length ||
!(mem->prot & MALI_MEM_PROT_CPU_RD))
__panwrap_fetch_mem_err(mem, gpu_va, size, line, filename);
return mem->addr + gpu_va - mem->gpu_va;
}
#define panwrap_fetch_gpu_mem(mem, gpu_va, size) \
__panwrap_fetch_gpu_mem(mem, gpu_va, size, __LINE__, __FILE__)
/* Returns a validated pointer to mapped GPU memory with the given pointer type,
* size automatically determined from the pointer type
*/
#define PANWRAP_PTR(mem, gpu_va, type) \
((type*)(__panwrap_fetch_gpu_mem(mem, gpu_va, sizeof(type), \
__LINE__, __FILE__)))
/* Usage: <variable type> PANWRAP_PTR_VAR(name, mem, gpu_va) */
#define PANWRAP_PTR_VAR(name, mem, gpu_va) \
name = __panwrap_fetch_gpu_mem(mem, gpu_va, sizeof(*name), \
__LINE__, __FILE__)
#endif /* __MMAP_TRACE_H__ */
/*
* © Copyright 2018 The Panfrost Community
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU licence.
*
* A copy of the licence is included with the program, and can also be obtained
* from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
* Boston, MA 02110-1301, USA.
*
*/
#include "panwrap.h"
#include "panwrap-shader.h"
#include <mali-ioctl.h>
#include <mali-job.h>
#include <stdio.h>
#include <memory.h>
#include <Disasm.h>
/* Routines for handling shader assembly, calling out to external assembler and
* disassemblers. Currently only implemented under Midgard; Bifrost code should
* be integrated here as well in the near future, once an assembler is written
* for that platform. */
/* TODO: expose in meson so Lyude doesn't get annoyed at me for breaking
* Bifrost */
#define SHADER_MIDGARD
#ifdef SHADER_MIDGARD
/* Disassemble the shader itself. */
void
panwrap_shader_disassemble(mali_ptr shader_ptr, int shader_no, int type)
{
struct panwrap_mapped_memory *shaders = panwrap_find_mapped_gpu_mem_containing(shader_ptr);
ptrdiff_t offset = shader_ptr - shaders->gpu_va;
/* Disassemble it at trace time... */
panwrap_log("const char shader_src_%d[] = R\"(\n", shader_no);
DisassembleMidgard(shaders->addr + offset, shaders->length - offset);
panwrap_log(")\";\n\n");
/* ...but reassemble at runtime! */
panwrap_log("//pandev_shader_%s(%s + %zd, shader_src_%d, %d);\n\n",
type == SHADER_FRAGMENT ? "compile" : "assemble",
shaders->name,
offset / sizeof(uint32_t),
shader_no,
type);
}
#else
void
panwrap_shader_disassemble(mali_ptr shader_ptr, int shader_no, int type)
{
panwrap_msg("Shader decoding is not yet supported on non-Midgard platforms\n");
panwrap_msg("No disassembly performed for shader at " MALI_PTR_FMT "\n", shader_ptr);
}
#endif
/*
* © Copyright 2018 The Panfrost Community
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU licence.
*
* A copy of the licence is included with the program, and can also be obtained
* from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
* Boston, MA 02110-1301, USA.
*
*/
#ifndef PANWRAP_SHADER_H
#define PANWRAP_SHADER_H
#include <mali-ioctl.h>
#include <mali-job.h>
#include "panwrap.h"
#define SHADER_VERTEX JOB_TYPE_VERTEX
#define SHADER_FRAGMENT JOB_TYPE_TILER
void panwrap_shader_disassemble(mali_ptr shader_ptr, int shader_no, int type);
#endif /* !PANWRAP_SHADER_H */
@@ -12,11 +12,6 @@
*
*/
/*
* Various bits and pieces of this borrowed from the freedreno project, which
* borrowed from the lima project.
*/
#ifndef __WRAP_H__
#define __WRAP_H__
@@ -25,37 +20,12 @@
#include <panloader-util.h>
#include <time.h>
#include "panwrap-mmap.h"
#include "panwrap-decoder.h"
struct panwrap_flag_info {
u64 flag;
const char *name;
};
#define PROLOG(func) \
static typeof(func) *orig_##func = NULL; \
if (!orig_##func) \
orig_##func = __rd_dlsym_helper(#func); \
void __attribute__((format (printf, 2, 3))) panwrap_log_typed(enum panwrap_log_type type, const char *format, ...);
void __attribute__((format (printf, 1, 2))) panwrap_log_cont(const char *format, ...);
void panwrap_log_empty(void);
void panwrap_log_flush(void);
void panwrap_log_decoded_flags(const struct panwrap_flag_info *flag_info,
u64 flags);
void ioctl_log_decoded_jd_core_req(mali_jd_core_req req);
void panwrap_log_hexdump(const void *data, size_t size);
void panwrap_log_hexdump_trimmed(const void *data, size_t size);
void panwrap_timestamp(struct timespec *);
bool panwrap_parse_env_bool(const char *env, bool def);
long panwrap_parse_env_long(const char *env, long def);
const char * panwrap_parse_env_string(const char *env, const char *def);
extern short panwrap_indent;
void * __rd_dlsym_helper(const char *name);
#endif /* __WRAP_H__ */
# Built-binaries
panloader
gcc sample.c pandev.c slow-framebuffer.c -I../include -I../build/include -I. -lm -ldl -lpthread -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE=1 -lX11 assemble.c allocate.c
#include "p_compiler.h"
#include "p_config.h"
#include "p_context.h"
#include "p_defines.h"
#include "p_format.h"
#include "p_screen.h"
#include "p_state.h"
/*
* © Copyright 2017 The Panfrost Community
*
* This program is free software and is provided to you under the terms of the
* GNU General Public License version 2 as published by the Free Software
* Foundation, and any use by you of this program is subject to the terms
* of such GNU licence.
*
* A copy of the licence is included with the program, and can also be obtained
* from Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
* Boston, MA 02110-1301, USA.
*
*/
#include <stdio.h>
#include "pandev.h"
int main(int argc, char **argv)
{
int fd = pandev_open();
if (fd < 0) {
printf("pandev_open() failed with rc%d\n", -fd);
return -fd;
}
printf("More to come soon :)\n");
return 0;
}
srcs = [
'trans-test.c',
'pandev.c',
'allocate.c',
'assemble.c',
'slow-framebuffer.c',
'trans-builder.c',
'limare-swizzle.c'
]
if is_android
X_dep = []
else
X_dep = cc.find_library('X11', required: true)
endif
executable(
'panloader',
srcs,
include_directories: inc,
dependencies: [common_dep, X_dep],
link_args: common_exec_largs,
install: true
)