    glsl: Add an IR lowering pass to convert mediump operations to 16-bit · b83f4b9f
    Neil Roberts authored
    This works by finding the first rvalue that it can lower using an
    ir_rvalue_visitor. In that case it adds a conversion to float16
    after each rvalue and a conversion back to float before storing
    the assignment.
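The conversion scheme described above can be sketched on a toy expression tree. This is only an illustration, not Mesa's code: `Expr`, `lower`, `lower_assignment_rhs` and the `"f2f32"` label are hypothetical names, and the real pass operates on `ir_rvalue` nodes.

```cpp
#include <memory>
#include <string>
#include <utility>

// Toy expression node; NOT Mesa's ir_rvalue. All names are hypothetical.
struct Expr {
    std::string op;                  // "var", an operator, or a conversion label
    std::string name;                // variable name, used by "var" nodes only
    std::shared_ptr<Expr> lhs, rhs;  // operands (rhs is null for unary nodes)
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr var(const std::string &n) {
    return std::make_shared<Expr>(Expr{"var", n, nullptr, nullptr});
}
ExprPtr unary(const std::string &op, ExprPtr a) {
    return std::make_shared<Expr>(Expr{op, "", std::move(a), nullptr});
}
ExprPtr binary(const std::string &op, ExprPtr a, ExprPtr b) {
    return std::make_shared<Expr>(Expr{op, "", std::move(a), std::move(b)});
}

// Convert each 32-bit leaf to 16 bits; the arithmetic above the leaves
// then operates on the converted operands.
ExprPtr lower(const ExprPtr &e) {
    if (e->op == "var")
        return unary("f2fmp", e);
    if (!e->rhs)
        return unary(e->op, lower(e->lhs));
    return binary(e->op, lower(e->lhs), lower(e->rhs));
}

// Convert the lowered value back to 32-bit float before it is stored.
ExprPtr lower_assignment_rhs(const ExprPtr &rhs) {
    return unary("f2f32", lower(rhs));
}

// Print the tree so the inserted conversions are visible.
std::string to_string(const ExprPtr &e) {
    if (e->op == "var")
        return e->name;
    if (!e->rhs)
        return e->op + "(" + to_string(e->lhs) + ")";
    return "(" + to_string(e->lhs) + " " + e->op + " " + to_string(e->rhs) + ")";
}
```

For `a + b`, `to_string(lower_assignment_rhs(...))` yields `f2f32((f2fmp(a) + f2fmp(b)))`.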
    It also uses a set to keep track of rvalues that have already been
    lowered. The handle_rvalue method of the rvalue visitor doesn’t
    provide any way to stop iteration. If we handle a value in
    find_precision_visitor we want to be able to stop it from descending into
    the lowered rvalue again.
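A minimal sketch of why the set is needed, assuming a walker that calls a handler on every node and always descends. `Node`, `Lowerer` and `lower_calls` are hypothetical names used only for this illustration.

```cpp
#include <unordered_set>
#include <vector>

// Toy IR node; hypothetical, not Mesa's ir_rvalue.
struct Node {
    bool lowerable = false;
    bool is_16bit = false;
    std::vector<Node *> children;
};

struct Lowerer {
    std::unordered_set<const Node *> lowered;  // nodes that were already handled
    int lower_calls = 0;                       // how many subtrees were lowered

    // Lower a whole subtree and remember every node in it.
    void lower_subtree(Node *n) {
        n->is_16bit = true;
        lowered.insert(n);
        for (Node *c : n->children)
            lower_subtree(c);
    }

    // Called for every node. It cannot tell the walker to stop, so it
    // consults the set to skip nodes inside an already lowered subtree.
    void handle(Node *n) {
        if (lowered.count(n))
            return;
        if (n->lowerable) {
            ++lower_calls;
            lower_subtree(n);
        }
    }

    // The walker always descends into the children after calling handle().
    void walk(Node *n) {
        handle(n);
        for (Node *c : n->children)
            walk(c);
    }
};
```

Without the set lookup in `handle`, a lowerable child of an already lowered node would be lowered a second time when the walker reaches it.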
    Additionally, this pass disallows converting nodes that contain non-float types.
    The can_lower_rvalue function explicitly excludes any branches
    that have non-float types except bools. This avoids the need to have
    special handling for functions that convert to int or double.
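The type restriction could look roughly like the following sketch. `BaseType`, `TypedNode` and `can_lower` are hypothetical stand-ins, not the actual can_lower_rvalue implementation.

```cpp
#include <vector>

// Hypothetical base types, for illustration only.
enum class BaseType { Float, Bool, Int, Uint, Double };

struct TypedNode {
    BaseType type;
    std::vector<const TypedNode *> children;
};

// Only float and bool values may appear in a lowered branch.
bool type_allows_lowering(BaseType t) {
    return t == BaseType::Float || t == BaseType::Bool;
}

// A branch is rejected as soon as any node in it has another type.
bool can_lower(const TypedNode &n) {
    if (!type_allows_lowering(n.type))
        return false;
    for (const TypedNode *c : n.children) {
        if (!can_lower(*c))
            return false;
    }
    return true;
}
```

Because a branch containing an int or double node is rejected as a whole, conversions such as float-to-int never end up inside a lowered region.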
    Co-authored-by: Hyunjun Ko <zzoon@igalia.com>
    v2. Adds lowering for texture samples
    v3. Instead of checking whether each node can be lowered while walking the
    tree, a separate tree walk is now done to check all of the nodes in a
    single pass. The lowerable nodes are added to a set which is checked
    during find_precision_visitor instead of calling can_lower_rvalue.
    v4. Move the special case for temporaries to find_lowerable_rvalues. This
    needs to be handled while checking for lowerable rvalues so that any
    later dereferences of the variable will see the right precision.
    v5. Add an override to visit ir_call instructions and override the
    precision of the temporary variable in the same way as is done for
    builtin temporaries and ir_assignment calls.
    v6. Changes the pass so that it doesn’t need to lower an entire subtree in
    order to perform a lowering. Instead, certain instructions can be
    marked as being independent of their child instructions. For example,
    this is the case with array dereferences. The precision of the array
    index doesn’t have any bearing on whether things using the result of
    the array deref can be lowered.
    Now, only toplevel lowerable nodes are added to lowerable_rvalues
    instead of additionally adding all of the subnodes.
    It now also only needs one hash table instead of two.
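One way to picture the v6 behaviour is the sketch below. It is an assumption about the general shape of the idea, not the pass itself: `IrNode`, `Finder` and the `independent_of_children` flag are hypothetical names.

```cpp
#include <unordered_set>
#include <vector>

// Toy IR node; hypothetical, not Mesa's ir_instruction.
struct IrNode {
    bool self_lowerable;           // the node's own type/precision permits 16-bit
    bool independent_of_children;  // e.g. an array dereference w.r.t. its index
    std::vector<IrNode *> children;
};

struct Finder {
    std::unordered_set<const IrNode *> toplevel;  // roots of lowerable regions

    // Post-order walk; returns whether n itself can be lowered.
    bool visit(IrNode *n) {
        std::vector<IrNode *> ok_children;
        bool all_ok = true;
        for (IrNode *c : n->children) {
            if (visit(c))
                ok_children.push_back(c);
            else
                all_ok = false;
        }
        bool result = n->self_lowerable &&
                      (n->independent_of_children || all_ok);
        // A lowerable parent that depends on its children absorbs them into
        // its own region; otherwise each lowerable child is a region root.
        bool absorbs = result && !n->independent_of_children;
        if (!absorbs) {
            for (IrNode *c : ok_children)
                toplevel.insert(c);
        }
        return result;
    }

    void run(IrNode *root) {
        if (visit(root))
            toplevel.insert(root);
    }
};
```

In this sketch an int array index under an independent dereference does not stop the expression that uses the dereference from being lowered, and only the root of that expression is recorded.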
    v7. Don’t try to lower sampler types. Instead, the sample instruction is
    now treated as an independent point where the result of the sample can
    be used in a lowered section. The precision of the sampler type
    determines the precision of the sample instruction. This also means
    the coordinates to the sampler can be lowered.
    v8. Use f2fmp instead of f2f16.
    v9. Disable lowering of derivative calculations, which might not work
    properly on some hw backends.
    Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
    Part-of: <!3885>