Draft: clover: implement CLOVER_DEVICE_ENABLE like AMD APP's GPU_DEVICE_ORDINAL
AMD APP (ROCm, PAL, Orca, fglrx) implements the GPU_DEVICE_ORDINAL
environment variable as a comma-separated list of OpenCL device
id numbers.
This implements the same feature for Clover using the
CLOVER_DEVICE_ENABLE environment variable.
Example:
CLOVER_DEVICE_ENABLE='0,3' clinfo --list
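
The intended semantics can be sketched as follows. This is an illustrative Python model, not Clover's actual code: the helper names `parse_device_ids` and `enabled_devices` are hypothetical, and the handling of an unset variable (all devices enabled) and of malformed entries (skipped) are assumptions made for the sketch.

```python
import os


def parse_device_ids(value):
    """Parse a comma-separated list such as '0,3' into a set of ints."""
    ids = set()
    for token in value.split(","):
        try:
            ids.add(int(token.strip()))
        except ValueError:
            # Assumption: malformed entries are skipped, not fatal.
            continue
    return ids


def enabled_devices(devices, env=os.environ, var="CLOVER_DEVICE_ENABLE"):
    """Return the devices whose index is listed in the variable.

    Assumption: when the variable is unset, every device stays enabled.
    """
    value = env.get(var)
    if value is None:
        return list(devices)
    wanted = parse_device_ids(value)
    return [dev for idx, dev in enumerate(devices) if idx in wanted]


# Four hypothetical devices; only ids 0 and 3 are kept.
devices = ["dev0", "dev1", "dev2", "dev3"]
print(enabled_devices(devices, env={"CLOVER_DEVICE_ENABLE": "0,3"}))
# ['dev0', 'dev3']
```
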
It is not named GPU_DEVICE_ORDINAL because the environment
variable must be specific to the OpenCL platform; otherwise one
cannot disable a device in one platform and enable it in
another one.
As an example, both ROCm and PAL attempt to support the
Hawaii GPU, but ROCm is faulty and wrecks the kernel. The
user can use GPU_DEVICE_ORDINAL to prevent ROCm from driving
the Hawaii GPU and avoid wrecking the kernel, but since
all AMD platforms use the same environment variable name,
it also prevents PAL from driving the Hawaii GPU. So one
cannot host a Hawaii GPU and a Vega GPU on the same system
by enabling Vega and disabling Hawaii in ROCm and enabling
Hawaii in PAL, as GPU_DEVICE_ORDINAL will disable Hawaii in
both platforms.
The platform-specific variable name is meant to avoid reproducing that mistake.
So the variable is named CLOVER_DEVICE_ENABLE in a way that lets
us implement the same feature in rusticl by naming
the variable RUSTICL_DEVICE_ENABLE, so we can
enable GPU A and disable GPU B in Clover while
disabling GPU A and enabling GPU B in rusticl.
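
The difference between a shared and a per-platform variable can be illustrated with a small Python model. It is only a sketch: the `visible` helper is hypothetical, RUSTICL_DEVICE_ENABLE is the name proposed above and not an existing variable, and the model assumes that both platforms enumerate the devices in the same order.

```python
def visible(devices, env, var):
    """Devices a platform exposes, given the variable it reads."""
    value = env.get(var)
    if value is None:
        # Assumption: an unset variable leaves every device enabled.
        return list(devices)
    wanted = {int(tok) for tok in value.split(",") if tok.strip()}
    return [dev for idx, dev in enumerate(devices) if idx in wanted]


# Per-platform names: each platform gets its own device selection.
devices = ["GPU A", "GPU B"]
env = {"CLOVER_DEVICE_ENABLE": "0", "RUSTICL_DEVICE_ENABLE": "1"}
clover = visible(devices, env, "CLOVER_DEVICE_ENABLE")
rusticl = visible(devices, env, "RUSTICL_DEVICE_ENABLE")
print(clover, rusticl)  # ['GPU A'] ['GPU B']

# Shared name: hiding Hawaii from one platform hides it from both.
amd_devices = ["Hawaii", "Vega"]
shared_env = {"GPU_DEVICE_ORDINAL": "1"}
rocm = visible(amd_devices, shared_env, "GPU_DEVICE_ORDINAL")
pal = visible(amd_devices, shared_env, "GPU_DEVICE_ORDINAL")
print(rocm, pal)  # ['Vega'] ['Vega']
```
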
The variable name doesn't mention any device type (like GPU)
on purpose: it is meant to enable/disable any device from the
same platform, whatever the device type.
The proposed variable name does not make use of the ORDINAL
word because it may be possible to extend the format in the
future to list hardware addresses or other identifiers that
are more predictable. Implementing the feature with id numbers
seems to be good enough for now.