Commit 5a5dc4f5 authored by Alyssa Rosenzweig

Squash early Midgard driver

History preserved in a branch.

Rebase meson.build

Fix syntax errors in the meson.build
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Import ir3_cmdline.c from freedreno into panfrost
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Begin removing freedreno-specific code in midgard
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Fix panfrost include
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Fully decouple midgard_cmdline from freedreno

This enables the module to compile, providing stubs for the NIR
compiler.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Fix panfrost dependency
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

[midgard] Dump NIR and remove unnecessary passes
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Further reduce midgard
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Iterate NIR instructions

Further simplification
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Ditto
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Trace out emit path for load_const

Store output intrinsic
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Also vertex shaders
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Lower var copies
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Load uniform stub
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

String through compiler context

Learn how to use util_dynarray for current_block

Import midgard shader defines by Connor Abbott

These were found in the original Midgard disassembler by cwabbott,
extracted from the project cwabbots-open-gpu-tools under the license
stated. They will be used here for instruction emission in the Midgard
compiler.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Iterate midgard instruction types

Remove type, next_type from load_store_t

Instruction type tags

Compute instruction lookahead

Refactor get_lookahead_type

Fix lookahead by lowering tag format
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Fill in part of load_uniform, other ALU tags, etc
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Dump load_store op

Macro for load_uniform instructions

Use for store_vary32 as well

Register aliases

reg, offset arguments to load_store

Hack until we have initial output :)

Swizzle macro

Factor out emit_binary_instruction

Refactor file I/O

Begin emitting ALU ops

ALU padding

I misunderstood padding; fix it

Demonstrate some tacked on constants

Set sources

Move ALU register work

String through constants

Correct registers

Use correct register in fmov

Refactor into M_ALU macro

ALU_2

Factor out attach_constants

Remove print

Emit ALU

Fixes to '

Make register resolution at least somewhat plausible

Remove some debugging prints

ALU source modifiers

EMIT_ALU_CASE to macro

fmul

fmin, fmax

load_vary

Fix src

Shader stage to differentiate varying/attrib load

Algebraic pass

Actual optimisation loop

Import full list of known ALU opcodes

Emit for remaining ALU ops (where possible)

Update ALU ops

Disable incorrect fsin/fcos for now

Correctly implement sin and cos, extending NIR

Explain midgard_instruction in relation to scheduler

Any configuration in load_const is okay

Comment half floats

Don't break aliasing rules

Begin eliminate_constant_mov pass

Finish mov elimination

Use raw SSA in the midgard compiler

Register allocate stub

fmov elimination is much easier in SSA space

Switch to /dev/shm

Try hash

Search for constants

Attach maybe

I feel silly -- fix move elimination

Update compiler options

Reflow constant move loop

Pair load/store instructions

Don't introduce a dependency chain

Correct fmov argument ordering

[midgard] Disable vertex shader compilation

The vertex shader epilogue for these GPUs is not yet well understood;
it's not worth trying to compile for it quite yet.

[midgard] FMA does not exist for GL

[midgard] Lowering vecs to movs will be useful

[midgard] Fix fmov instruction ordering

[midgard] Properly noop load/stores

midgard: Introduce synthwrite to catch gl_FragColor
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Stub framebuffer write
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Introduce variadic EMIT syntax sugar
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Second half of the fbwrite
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Literal out for proper fbwrite
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Use actual compact writeout fields
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Begin ALU op combining
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Continue ALU combining work

midgard: Cleanup printfs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: ALU combining
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Instruction-combining aware lookahead
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Register allocation position
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Workaround missing preliminary load/store errata
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Synthwrite was a mistake
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Fix warnings
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Basic uniform loading support
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Set unknown field in varying load

Saner load varying

midgard: Use adder for add instructions
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Rework load_input, etc to act like vc4/freedreno
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Alias imov to fmov
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Fix store out regression
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Begin scalar work
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Don't lower fsat

midgard: Fix build
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Lower to source mods pass
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Saturation arithmetic
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

Refactor ALU emit to allow for scalar emit in future

Remove unnecessary alu defs

Allow scalar ops to be emitted

midgard: Implement scalar_alu_modifiers

Correct swizzle placement

midgard: Correct order

midgard: Account for scalar component special case

midgard: Sort out memory safety regression from scalar refactor

midgard: vlut mask

midgard: Begin porting over vec4 pass from freedreno

midgard: Fix vec4

midgard: Remove deadcode

Fix frcp support

midgard: Fix bugs with scalar source modifiers

midgard: Lower subtraction

midgard: Begin debugging transcendental functions

midgard: Proper SSA register aliasing

midgard: General improvements relating to unused arguments

midgard: Reenable vertex and disable double print

midgard: Only emit fragment epilogue for fragment shaders

midgard: Load attribute

midgard: Assign var locations

midgard: Front-half of SSA aliases

midgard: Further progress on aliasing

midgard: Optimise uniforms similarly

midgard: Fix uniform special case

midgard: Cleanup uniform aliasing

midgard: Cleanup warnings

midgard: Fix nondeterministic segfault

midgard: Fix regression packing with unuseds

midgard: Fix regression in regression fix

midgard: Begin store vary emit

midgard: Begin experimenting with nir_builder

midgard: Write to special register from epilogue

midgard: Load gl_Position in vertex epilogue

midgard: Fix bug in aliasing implementation

midgard: Further hack on vertex shader epilogue

midgard: Defer stores to workaround hw errata?
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

midgard: Fix early constant inline termination

Cut off duplicated embedded constants

midgard: Move vertex epilogue to after var assignment

midgard: Import ugly internal code to fix vertex shader epilogue

midgard: Get vertex shaders working.... somehow

midgard: Reenable fragment compilation

midgard: Fix load/store noop emission

Save real softpipe

panfrost: Dump clears

midgard: Workaround compact branch errata

panfrost: XXX Hack in the trans library XXX

Hook into panfrost, uglily

Continue hacky panfrost integration

panfrost: Begin ripping out drawing to enable shaders

Begin interfacing with the hacky resource stuff

Link in transfer map

Hook up vertex functions

Disable user buffers for now

Solve some segfaults

transfer_unmap

Don't crash

Work fixing varying writes

Remove vertex epilogue varying magic

Proceed implementing vertex 'epilogue' the Right way

Remove cruft that has built up from previous refactor

Update comments; nir_instr_remove old st_vary

Remove now-unused defer_stores

Remove redundant r0 move

Note about the decaying issue

Fix data hazard determination for ld/st pairing

Finally get eliminate_varying_mov working nicely

Cleanup from previous commit

Dot products

Call do_mat_op_to_vec

Wrap do_mat_op_to_vec

Get uniforms doing something somewhat sane

Fix uniform access patterns

Galliumify set_constant_buffer

Cleanup comments

Inline n2m_alu_outmod

Compiler cleanup

Begin watermark RA

Fixes for watermark RA

Proceed writing real RA?

Get RA to work

Quiet output

Add some profiling stubs

Remove redundant lower_io calls

New information re varying registers

Honour literal_out in ls4

Implement vertex epilogue as per 12.5.1

Perspective division

Uniforms are backward; workaround buggy VLIW

Fix crash on resource destroy (mesa half)

Remove softpipeism

Work towards correct resizable shm windows

Map the surface in the right place

Continue

Remove what we can

Remove more

Cut more

Strip further

Continue

ACCELERATED flag

Remove

Strip shaders

Fix overzealous inline constants

Encode inline vector constants

Mark errata with ERRATA, not XXX

Enable two instruction chains instead of one

Embedded constants with ALU combining (fixes long-time regression)

Bundle duplicate constants

Cull ssa0 moves (missed from inline constant in luts)

Embedded to inline constant for right-constant scalar ops

Scalar op flip

Remove prints

Inline constants in vector ops

Begin work on instruction unit switching

Branch compact can be packed

Continue unit hopping work

Split out helpers to prepare for updating midgard.h

Pull in new midgard.h from SPD

f2i->u

Basic support for integers

Disable inline constants for the moment, since they're broken

inot requires MUL apparently

Import new ops

Emit ball/bany from NIR

Import backend algebraic NIR pass stuff

nir: Implement optional b2f->iand lowering

This pass is required by the Midgard compiler; our instruction set uses
NIR-style booleans (~0 for true) but lacks a dedicated b2f instruction.
Normally, this lowering pass would be implemented in a backend-specific
algebraic pass, but this conflicts with the existing iand->b2f pass in
nir_opt_algebraic.py, hanging the compiler. This patch thus makes the
existing pass optional (default on -- all other backends should remain
unaffected), adding an optional pass for lowering the opposite
direction.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
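
To make the lowering concrete: with NIR-style booleans, true is ~0 (all bits set), so ANDing it with the bit pattern of 1.0f leaves exactly 1.0f, while ANDing false (0) leaves 0.0f. The following is a minimal C sketch of that identity only, not code from this patch. The helper name `b2f_via_iand` is hypothetical, and the sketch assumes 32-bit IEEE 754 floats.

```c
#include <stdint.h>
#include <string.h>

/* NIR-style booleans: all bits set for true, zero for false. */
#define NIR_TRUE  (~UINT32_C(0))
#define NIR_FALSE (UINT32_C(0))

/* Hypothetical helper (not part of the patch): emulate b2f as a bitwise
 * AND with the bit pattern of 1.0f, which is what the lowered
 * ('iand', a, 1.0) expression computes. */
static float
b2f_via_iand(uint32_t b)
{
        float one = 1.0f;
        uint32_t one_bits;

        /* Reinterpret 1.0f as raw bits without breaking aliasing rules */
        memcpy(&one_bits, &one, sizeof(one_bits));

        uint32_t masked = b & one_bits;

        float result;
        memcpy(&result, &masked, sizeof(result));
        return result;
}
```

Calling `b2f_via_iand(NIR_TRUE)` yields 1.0f and `b2f_via_iand(NIR_FALSE)` yields 0.0f, which is why a dedicated b2f instruction is not needed when booleans are represented this way.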

f2b, b2f in midgard

Small cleanup; fix floor/ceil

LUT duplication

Guarantee proper fragment writeout (incurring a temporary performance regression)

Begin working on csel stuff

midgard: Move fsinpi stuff to backend-specific pass

Reenable embedded_to_inline_constant by making it integer aware

Fix constant attaching

ushr opcode

Fix issue with imin/imax blocking

Remove prints

Componentwise test for r0 breakup

Try to debug

When flipping arguments, also flip modifiers

Lower b2i to iand

Fix segfault with inot

Flip vector constants

isub is not commutative

fne _is_ commutative

Remove prints

Get rid of constant moves -- unnecessary complexity

Remove STAGE_PROFILING

Uniform base is no longer needed

Remove unused macro

Enable basic nir_register support in order to chuck out old vec4 pass

Call convert_from_ssa weakly and generalise to registers in LUT duplication

Fix st_vary input bug triggered by vertex epilogue refactor

Mask for clarity

Remove whitespace

Fix annoying compiler segfault

Reenable constant inlining (unaffected by registerisation)

Fix varying move regression and reenable

Stubs to emit textures from NIR

Begin basic texture op emission

Get texture handles correct

Set flags

Set .cont and .last

Hardcode mask/filter for now

Hardcode a swizzle as well

Force texture full for now

Do something with the input swizzle

Fix spelling error in header

midgard: Emit fmov for source/dest texture

midgard: Lower vars as necessary

Rescale for the replay :v

Handle weird 3D texture swizzle

Stub for cubemap

Hook up texture/sampler functions in softpipe shim

Don't advertise compute/geometry shaders

Import softpipe meson.build into panfrost

Move shim into ~/panfrost

Include panfrost_dri.so

Register as fake swr

Use the panfrost name

Restore original softpipe
parent b2653007
@@ -132,6 +132,7 @@ with_gallium_r300 = false
 with_gallium_r600 = false
 with_gallium_nouveau = false
 with_gallium_freedreno = false
+with_gallium_panfrost = false
 with_gallium_softpipe = false
 with_gallium_vc4 = false
 with_gallium_vc5 = false
@@ -149,7 +150,7 @@ if _drivers == 'auto'
 if ['x86', 'x86_64'].contains(host_machine.cpu_family())
 _drivers = 'r300,r600,radeonsi,nouveau,virgl,svga,swrast'
 elif ['arm', 'aarch64'].contains(host_machine.cpu_family())
-_drivers = 'pl111,vc4,vc5,freedreno,etnaviv,imx,nouveau,tegra,virgl,svga,swrast'
+_drivers = 'pl111,vc4,vc5,panfrost,freedreno,etnaviv,imx,nouveau,tegra,virgl,svga,swrast'
 else
 error('Unknown architecture. Please pass -Dgallium-drivers to set driver options. Patches gladly accepted to fix this.')
 endif
@@ -167,6 +168,7 @@ if _drivers != ''
 with_gallium_r600 = _split.contains('r600')
 with_gallium_nouveau = _split.contains('nouveau')
 with_gallium_freedreno = _split.contains('freedreno')
+with_gallium_panfrost = _split.contains('panfrost')
 with_gallium_softpipe = _split.contains('swrast')
 with_gallium_vc4 = _split.contains('vc4')
 with_gallium_vc5 = _split.contains('vc5')

@@ -1862,6 +1862,9 @@ typedef struct nir_shader_compiler_options {
 /** enables rules to lower idiv by power-of-two: */
 bool lower_idiv;
+/* lower b2f to iand */
+bool lower_b2f;
 /* Does the native fdot instruction replicate its result for four
 * components? If so, then opt_algebraic_late will turn all fdotN
 * instructions into fdot_replicatedN instructions.

@@ -214,7 +214,6 @@ unop("fquantize2f16", tfloat, "(fabs(src0) < ldexpf(1.0, -14)) ? copysignf(0.0f,
 unop("fsin", tfloat, "bit_size == 64 ? sin(src0) : sinf(src0)")
 unop("fcos", tfloat, "bit_size == 64 ? cos(src0) : cosf(src0)")
 # Partial derivatives.

@@ -318,7 +318,8 @@ optimizations = [
 (('imul', ('b2i', a), ('b2i', b)), ('b2i', ('iand', a, b))),
 (('fmul', ('b2f', a), ('b2f', b)), ('b2f', ('iand', a, b))),
 (('fsat', ('fadd', ('b2f', a), ('b2f', b))), ('b2f', ('ior', a, b))),
-(('iand', 'a@bool', 1.0), ('b2f', a)),
+(('iand', 'a@bool', 1.0), ('b2f', a), '!options->lower_b2f'),
+(('b2f@32', a), ('iand', a, 1.0), 'options->lower_b2f'),
 # True/False are ~0 and 0 in NIR. b2i of True is 1, and -1 is ~0 (True).
 (('ineg', ('b2i@32', a)), a),
 (('flt', ('fneg', ('b2f', a)), 0), a), # Generated by TGSI KILL_IF.

@@ -1238,7 +1238,7 @@ dri2_initialize_x11_swrast(_EGLDriver *drv, _EGLDisplay *disp)
 * Every hardware driver_name is set using strdup. Doing the same in
 * here will allow is to simply free the memory at dri2_terminate().
 */
-dri2_dpy->driver_name = strdup("swrast");
+dri2_dpy->driver_name = strdup("panfrost");
 if (!dri2_load_driver_swrast(disp))
 goto cleanup;

@@ -18,6 +18,10 @@
 #include "softpipe/sp_public.h"
 #endif
+#ifdef GALLIUM_PANFROST
+#include "panfrost/sp_public.h"
+#endif
 #ifdef GALLIUM_LLVMPIPE
 #include "llvmpipe/lp_public.h"
 #endif
@@ -55,6 +59,11 @@ sw_screen_create_named(struct sw_winsys *winsys, const char *driver)
 screen = swr_create_screen(winsys);
 #endif
+#if defined(GALLIUM_PANFROST)
+if (screen == NULL && strcmp(driver, "panfrost") == 0)
+screen = panfrost_create_screen(winsys);
+#endif
 return screen;
 }
@@ -71,6 +80,8 @@ sw_screen_create(struct sw_winsys *winsys)
 default_driver = "softpipe";
 #elif defined(GALLIUM_SWR)
 default_driver = "swr";
+#elif defined(GALLIUM_PANFROST)
+default_driver = "panfrost";
 #else
 default_driver = "";
 #endif

@@ -24,6 +24,10 @@
 #include "swr/swr_public.h"
 #endif
+#ifdef GALLIUM_PANFROST
+#include "panfrost/sp_public.h"
+#endif
 #ifdef GALLIUM_VIRGL
 #include "virgl/virgl_public.h"
 #include "virgl/vtest/virgl_vtest_public.h"
@@ -57,6 +61,12 @@ sw_screen_create_named(struct sw_winsys *winsys, const char *driver)
 screen = swr_create_screen(winsys);
 #endif
+#if defined(GALLIUM_PANFROST)
+printf("Hai\n");
+if (screen == NULL && strcmp(driver, "panfrost") == 0)
+screen = panfrost_create_screen(winsys);
+#endif
 return screen;
 }
@@ -73,6 +83,8 @@ sw_screen_create(struct sw_winsys *winsys)
 default_driver = "softpipe";
 #elif defined(GALLIUM_SWR)
 default_driver = "swr";
+#elif defined(GALLIUM_PANFROST)
+default_driver = "panfrost";
 #else
 default_driver = "";
 #endif

# Copyright © 2017 Intel Corporation
# Copyright © 2018 Alyssa Rosenzweig
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
files_panfrost = files(
'sp_clear.c',
'sp_clear.h',
'sp_context.c',
'sp_context.h',
'sp_draw_arrays.c',
'sp_fence.c',
'sp_fence.h',
'sp_flush.c',
'sp_flush.h',
'sp_fs.h',
'sp_limits.h',
'sp_public.h',
'sp_query.c',
'sp_state_shader.c',
'sp_query.h',
'sp_screen.c',
'sp_screen.h',
'sp_state_blend.c',
'sp_state_clip.c',
'sp_state.h',
'sp_state_sampler.c',
'sp_state_rasterizer.c',
'sp_state_so.c',
'sp_state_surface.c',
'sp_state_vertex.c',
'sp_surface.c',
'sp_surface.h',
'sp_texture.c',
'sp_texture.h',
'/home/guest/panloader/trans/pandev.c',
'/home/guest/panloader/trans/allocate.c',
'/home/guest/panloader/trans/assemble.c',
'/home/guest/panloader/trans/slow-framebuffer.c',
'/home/guest/panloader/trans/trans-builder.c',
)
libpanfrost = static_library(
'panfrost',
files_panfrost,
dependencies: [cc.find_library('X11', required: true)],
include_directories : [inc_gallium_aux, inc_gallium, inc_include, inc_src, include_directories('/home/guest/panloader/trans'), include_directories('/home/guest/panloader/include'), include_directories('/home/guest/panloader/build/include')],
c_args : [c_vis_args, c_msvc_compat_args],
)
driver_panfrost = declare_dependency(
compile_args : '-DGALLIUM_PANFROST',
link_with : libpanfrost
)
midgard_nir_algebraic_c = custom_target(
'midgard_nir_algebraic.c',
input : 'midgard/midgard_nir_algebraic.py',
output : 'midgard_nir_algebraic.c',
command : [
prog_python2, '@INPUT@',
'-p', join_paths(meson.source_root(), 'src/compiler/nir/'),
],
capture : true,
depend_files : nir_algebraic_py,
)
files_midgard = files(
'midgard/midgard_cmdline.c',
'midgard/cppwrap.cpp',
)
midgard_compiler = executable(
'midgard_compiler',
[files_midgard, midgard_nir_algebraic_c],
include_directories : [inc_common, inc_src, inc_include, inc_gallium, inc_gallium_aux, include_directories('midgard')],
dependencies : [
dep_thread,
idep_nir
],
link_with : [
libgallium,
libglsl_standalone,
libmesa_util
],
build_by_default : true
)
struct exec_list;
bool do_mat_op_to_vec(struct exec_list *instructions);
extern "C" {
bool c_do_mat_op_to_vec(struct exec_list *instructions) {
return do_mat_op_to_vec(instructions);
}
};
/* Author(s):
* Alyssa Rosenzweig
*
* Copyright (c) 2018 Alyssa Rosenzweig (alyssa@rosenzweig.io)
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
/* Some constants and macros not found in the disassembler */
#define OP_IS_STORE(op) (\
op == midgard_op_store_vary_16 || \
op == midgard_op_store_vary_32 \
)
/* ALU control words are single bit fields with a lot of space */
#define ALU_ENAB_VEC_MUL (1 << 17)
#define ALU_ENAB_SCAL_ADD (1 << 19)
#define ALU_ENAB_VEC_ADD (1 << 21)
#define ALU_ENAB_SCAL_MUL (1 << 23)
#define ALU_ENAB_VEC_LUT (1 << 25)
#define ALU_ENAB_BR_COMPACT (1 << 26)
#define ALU_ENAB_BRANCH (1 << 27)
/* Vector-independent shorthands for the above; these numbers are arbitrary and
* not from the ISA. Convert to the above with unit_enum_to_midgard */
#define UNIT_MUL 0
#define UNIT_ADD 1
#define UNIT_LUT 2
/* 4-bit type tags */
#define TAG_TEXTURE_4 0x3
#define TAG_LOAD_STORE_4 0x5
#define TAG_ALU_4 0x8
#define TAG_ALU_8 0x9
#define TAG_ALU_12 0xA
#define TAG_ALU_16 0xB
/* Special register aliases */
#define MAX_WORK_REGISTERS 16
/* Uniforms begin at (REGISTER_UNIFORMS - uniform_count) */
#define REGISTER_UNIFORMS 24
#define REGISTER_UNUSED 24
#define REGISTER_CONSTANT 26
#define REGISTER_VARYING_BASE 26
#define REGISTER_OFFSET 27
#define REGISTER_TEXTURE_BASE 28
#define REGISTER_SELECT 31
/* SSA helper aliases to mimic the registers. UNUSED_0 encoded as an inline
* constant. UNUSED_1 encoded as REGISTER_UNUSED */
#define SSA_UNUSED_0 0
#define SSA_UNUSED_1 -2
#define SSA_FIXED_SHIFT 24
#define SSA_FIXED_REGISTER(reg) ((1 + reg) << SSA_FIXED_SHIFT)
#define SSA_REG_FROM_FIXED(reg) ((reg >> SSA_FIXED_SHIFT) - 1)
#define SSA_FIXED_MINIMUM SSA_FIXED_REGISTER(0)
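
A quick round-trip check of the fixed-register encoding above may help: `SSA_FIXED_REGISTER` stores (reg + 1) in the bits above `SSA_FIXED_SHIFT`, and `SSA_REG_FROM_FIXED` undoes it. In the sketch below the macros are copied from this header. The two helper functions are hypothetical, and treating every index at or above `SSA_FIXED_MINIMUM` as a fixed register is an inference from the macro names, not something this commit states.

```c
#include <stdbool.h>

/* Copied from the header above */
#define SSA_FIXED_SHIFT 24
#define SSA_FIXED_REGISTER(reg) ((1 + reg) << SSA_FIXED_SHIFT)
#define SSA_REG_FROM_FIXED(reg) ((reg >> SSA_FIXED_SHIFT) - 1)
#define SSA_FIXED_MINIMUM SSA_FIXED_REGISTER(0)
#define REGISTER_VARYING_BASE 26

/* Hypothetical helper: does this index name a fixed hardware register
 * rather than an ordinary SSA value? (Assumed interpretation.) */
static bool
ssa_index_is_fixed(int index)
{
        return index >= SSA_FIXED_MINIMUM;
}

/* Hypothetical helper: recover the hardware register number, or -1 when
 * the index is an ordinary SSA value. */
static int
ssa_index_to_register(int index)
{
        return ssa_index_is_fixed(index) ? SSA_REG_FROM_FIXED(index) : -1;
}
```

Under these assumptions, `SSA_FIXED_REGISTER(REGISTER_VARYING_BASE)` evaluates to 27 << 24 and `ssa_index_to_register` maps it back to 26, while a small index such as 5 is reported as an ordinary SSA value.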
/* Swizzle support */
#define SWIZZLE(A, B, C, D) ((D << 6) | (C << 4) | (B << 2) | (A << 0))
#define SWIZZLE_FROM_ARRAY(r) SWIZZLE(r[0], r[1], r[2], r[3])
#define COMPONENT_X 0x0
#define COMPONENT_Y 0x1
#define COMPONENT_Z 0x2
#define COMPONENT_W 0x3
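
As an illustration of the packing done by `SWIZZLE`, each of the four selectors takes two bits, with the first argument in the lowest bits. The sketch below copies the macros from this header and adds two hypothetical helpers (`swizzle_component` and `swizzle_reversed`) that are not part of the patch; they exist only to show how a packed value can be built and read back.

```c
/* Copied from the header above */
#define SWIZZLE(A, B, C, D) ((D << 6) | (C << 4) | (B << 2) | (A << 0))
#define SWIZZLE_FROM_ARRAY(r) SWIZZLE(r[0], r[1], r[2], r[3])
#define COMPONENT_X 0x0
#define COMPONENT_Y 0x1
#define COMPONENT_Z 0x2
#define COMPONENT_W 0x3

/* Hypothetical helper: extract the 2-bit selector stored for position
 * `lane` (0..3) of a packed swizzle. */
static unsigned
swizzle_component(unsigned swizzle, unsigned lane)
{
        return (swizzle >> (2 * lane)) & 0x3;
}

/* Hypothetical helper: build the reversed (wzyx) swizzle from an array,
 * exercising SWIZZLE_FROM_ARRAY. */
static unsigned
swizzle_reversed(void)
{
        unsigned r[4] = { COMPONENT_W, COMPONENT_Z, COMPONENT_Y, COMPONENT_X };
        return SWIZZLE_FROM_ARRAY(r);
}
```

With this layout the identity swizzle `SWIZZLE(COMPONENT_X, COMPONENT_Y, COMPONENT_Z, COMPONENT_W)` packs to 0xE4, and the reversed swizzle packs to 0x1B.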
/* Output writing "condition" for the branch (all ones) */
#define COND_FBWRITE 0x3
/* See ISA notes */
#define LDST_NOP (3)
/* Is this opcode that of an integer? */
static bool
midgard_is_integer_op(int op)
{
switch (op) {
case midgard_alu_op_iadd:
case midgard_alu_op_ishladd:
case midgard_alu_op_isub:
case midgard_alu_op_imul:
case midgard_alu_op_imin:
case midgard_alu_op_imax:
case midgard_alu_op_iasr:
case midgard_alu_op_ilsr:
case midgard_alu_op_ishl:
case midgard_alu_op_iand:
case midgard_alu_op_ior:
case midgard_alu_op_inot:
case midgard_alu_op_iandnot:
case midgard_alu_op_ixor:
case midgard_alu_op_imov:
//case midgard_alu_op_f2i:
//case midgard_alu_op_f2u:
case midgard_alu_op_ieq:
case midgard_alu_op_ine:
case midgard_alu_op_ilt:
case midgard_alu_op_ile:
case midgard_alu_op_iball_eq:
case midgard_alu_op_ibany_neq:
//case midgard_alu_op_i2f:
//case midgard_alu_op_u2f:
case midgard_alu_op_icsel:
return true;
default:
return false;
}
}
/* Author(s):
* Connor Abbott
* Alyssa Rosenzweig
*
* Copyright (c) 2013 Connor Abbott (connor@abbott.cx)
* Copyright (c) 2018 Alyssa Rosenzweig (alyssa@rosenzweig.io)
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#ifndef __midgard_h__
#define __midgard_h__
#include <stdint.h>
#include <stdbool.h>
typedef enum
{
midgard_word_type_alu,
midgard_word_type_load_store,
midgard_word_type_texture,
midgard_word_type_unknown
} midgard_word_type;
typedef enum
{
midgard_alu_vmul,
midgard_alu_sadd,
midgard_alu_smul,
midgard_alu_vadd,
midgard_alu_lut
} midgard_alu;
/*
* ALU words
*/
typedef enum
{
midgard_alu_op_fadd = 0x10,
midgard_alu_op_fmul = 0x14,
midgard_alu_op_fmin = 0x28,
midgard_alu_op_fmax = 0x2C,
midgard_alu_op_fmov = 0x30,
midgard_alu_op_ffloor = 0x36,
midgard_alu_op_fceil = 0x37,
midgard_alu_op_fdot3 = 0x3C,
midgard_alu_op_fdot3r = 0x3D,
midgard_alu_op_fdot4 = 0x3E,
midgard_alu_op_freduce = 0x3F,
midgard_alu_op_iadd = 0x40,
midgard_alu_op_ishladd = 0x41,
midgard_alu_op_isub = 0x46,
midgard_alu_op_imul = 0x58,
midgard_alu_op_imin = 0x60,
midgard_alu_op_imax = 0x62,
midgard_alu_op_iasr = 0x68,
midgard_alu_op_ilsr = 0x69,
midgard_alu_op_ishl = 0x6E,
midgard_alu_op_iand = 0x70,
midgard_alu_op_ior = 0x71,
midgard_alu_op_inot = 0x72,
midgard_alu_op_iandnot = 0x74, /* (a, b) -> a & ~b, used for not/b2f */
midgard_alu_op_ixor = 0x76,
midgard_alu_op_imov = 0x7B,
midgard_alu_op_feq = 0x80,
midgard_alu_op_fne = 0x81,
midgard_alu_op_flt = 0x82,
midgard_alu_op_fle = 0x83,
midgard_alu_op_fball_eq = 0x88,
midgard_alu_op_bball_eq = 0x89,
midgard_alu_op_bbany_neq = 0x90, /* used for bvec4(1) */
midgard_alu_op_fbany_neq = 0x91, /* bvec4(0) also */
midgard_alu_op_f2i = 0x99,
midgard_alu_op_f2u = 0x9D,
midgard_alu_op_ieq = 0xA0,
midgard_alu_op_ine = 0xA1,
midgard_alu_op_ilt = 0xA4,
midgard_alu_op_ile = 0xA5,
midgard_alu_op_iball_eq = 0xA8,
midgard_alu_op_ball = 0xA9,
midgard_alu_op_ibany_neq = 0xB1,
midgard_alu_op_i2f = 0xB8,
midgard_alu_op_u2f = 0xBC,
midgard_alu_op_icsel = 0xC1,
midgard_alu_op_fcsel = 0xC5,
midgard_alu_op_fatan_pt2 = 0xE8,
midgard_alu_op_frcp = 0xF0,
midgard_alu_op_frsqrt = 0xF2,
midgard_alu_op_fsqrt = 0xF3,
midgard_alu_op_fexp2 = 0xF4,
midgard_alu_op_flog2 = 0xF5,
midgard_alu_op_fsin = 0xF6,
midgard_alu_op_fcos = 0xF7,
midgard_alu_op_fatan2_pt1 = 0xF9,
} midgard_alu_op;
typedef enum
{
midgard_outmod_none = 0,
midgard_outmod_pos = 1,
midgard_outmod_int = 2,
midgard_outmod_sat = 3
} midgard_outmod;
typedef enum
{
midgard_reg_mode_half = 1,
midgard_reg_mode_full = 2
} midgard_reg_mode;
typedef enum
{
midgard_dest_override_lower = 0,
midgard_dest_override_upper = 1,
midgard_dest_override_none = 2
} midgard_dest_override;
typedef struct
__attribute__((__packed__))
{
bool abs : 1;
bool negate : 1;
/* replicate lower half if dest = half, or low/high half selection if
* dest = full
*/
bool rep_low : 1;
bool rep_high : 1; /* unused if dest = full */
bool half : 1; /* only matters if dest = full */
unsigned swizzle : 8;
} midgard_vector_alu_src;
typedef struct
__attribute__((__packed__))
{
midgard_alu_op op : 8;
midgard_reg_mode reg_mode : 2;
unsigned src1 : 13;
unsigned src2 : 13;
midgard_dest_override dest_override : 2;
midgard_outmod outmod : 2;
unsigned mask : 8;
} midgard_vector_alu;
typedef struct
__attribute__((__packed__))
{
bool abs : 1;
bool negate : 1;
bool full : 1; /* 0 = half, 1 = full */
unsigned component : 3;
} midgard_scalar_alu_src;
typedef struct
__attribute__((__packed__))
{
midgard_alu_op op : 8;
unsigned src1 : 6;
unsigned src2 : 11;
unsigned unknown : 1;
midgard_outmod outmod : 2;
bool output_full : 1;
unsigned output_component : 3;
} midgard_scalar_alu;
typedef struct
__attribute__((__packed__))
{
unsigned src1_reg : 5;
unsigned src2_reg : 5;
unsigned out_reg : 5;
bool src2_imm : 1;
} midgard_reg_info;
typedef enum
{
midgard_jmp_writeout_op_branch_uncond = 1,
midgard_jmp_writeout_op_branch_cond = 2,
midgard_jmp_writeout_op_discard = 4,
midgard_jmp_writeout_op_writeout = 7,
} midgard_jmp_writeout_op;
typedef struct
__attribute__((__packed__))
{
midgard_jmp_writeout_op op : 3; /* == branch_uncond */
unsigned dest_tag : 4; /* tag of branch destination */
unsigned unknown : 2;
int offset : 7;
} midgard_branch_uncond;
typedef struct
__attribute__((__packed__))
{
midgard_jmp_writeout_op op : 3; /* == branch_cond */
unsigned dest_tag : 4; /* tag of branch destination */
int offset : 7;
unsigned cond : 2;
} midgard_branch_cond;
typedef struct
__attribute__((__packed__))
{
midgard_jmp_writeout_op op : 3; /* == writeout */
unsigned unknown : 13;
} midgard_writeout;
/*
* Load/store words
*/
typedef enum
{
midgard_op_ld_st_noop = 0x03,
midgard_op_load_attr_16 = 0x95,
midgard_op_load_attr_32 = 0x94,
midgard_op_load_vary_16 = 0x99,
midgard_op_load_vary_32 = 0x98,
midgard_op_load_color_buffer_16 = 0x9D,
midgard_op_load_uniform_16 = 0xAC,
midgard_op_load_uniform_32 = 0xB0,
midgard_op_store_vary_16 = 0xD5,
midgard_op_store_vary_32 = 0xD4
} midgard_load_store_op;
typedef enum
{
midgard_interp_centroid = 1,
midgard_interp_default = 2
} midgard_interpolation;
typedef struct
__attribute__((__packed__))
{
midgard_load_store_op op : 8;
unsigned reg : 5;
unsigned mask : 4;
unsigned swizzle : 8;
unsigned unknown : 16;
unsigned unknown0_1 : 4; /* Always zero */
/* Varying qualifiers, zero if not a varying */
unsigned flat : 1;
unsigned is_varying : 1; /* Always one for varying, but maybe something else? */
midgard_interpolation interpolation : 2;
unsigned unknown0_2 : 2; /* Always zero */
unsigned address : 9;
} midgard_load_store_word;
typedef struct
__attribute__((__packed__))
{
unsigned type : 4;
unsigned next_type : 4;
uint64_t word1 : 60;
uint64_t word2 : 60;
} midgard_load_store;
/* Texture pipeline results are in r28-r29 */
#define REG_TEX_BASE 28
/* Texture opcodes... maybe? */
#define TEXTURE_OP_NORMAL 0x11
#define TEXTURE_OP_TEXEL_FETCH 0x14
/* Texture format types, found in format */
#define TEXTURE_CUBE 0x00
#define TEXTURE_2D 0x02
#define TEXTURE_3D 0x03
typedef struct
__attribute__((__packed__))
{
unsigned type : 4;
unsigned next_type : 4;
unsigned op : 6;
unsigned shadow : 1;
unsigned unknown3 : 1;
/* A little obscure, but last is set for the last texture operation in
* a shader. cont appears to just be last's opposite (?). Yeah, I know,
* kind of funky.. BiOpen thinks it could do with memory hinting, or
* tile locking? */
unsigned cont : 1;
unsigned last : 1;
unsigned format : 5;
unsigned has_offset : 1;
/* Like in Bifrost */
unsigned filter : 1;
unsigned in_reg_select : 1;
unsigned in_reg_upper : 1;