Commit 7e4016a6 authored by Connor Abbott

bifrost: Document branch convergence/divergence more thoroughly

parent f5771004
......@@ -341,6 +341,41 @@ The "Branch Conditional" bit is always set if the "Back to Back" field is set to
The "Data Register Write Barrier" is set when the next clause writes to the data register of some previous clause.
=== Branching
Bifrost is a SIMT architecture, which means that the hardware executes 4 (or, in later versions, 8) threads at the same time. Divergence and reconvergence are mostly handled automatically by the hardware, but it needs some software help via the "Back to Back" bit.
In addition to the scalar IP, which holds the currently executing clause, and the exec mask, which says which threads are currently active, there's the vector IP. For active threads the vector IP is conceptually the same as the scalar IP, but for inactive threads it holds *the address at which the thread re-converges*. Furthermore, Bifrost maintains the invariant that *the scalar IP is always the minimum IP of all threads*. That is, the non-active threads all have a greater IP than the active threads. Thus, conceptually, after each clause is finished the hardware does the following:
1. Figure out the new IP for each active thread, and update the vector IP.
2. Calculate the minimum IP, set that as the scalar IP, and update the exec mask to contain only threads with the minimum IP.
However, most of the time this isn't necessary. Usually control flows directly from clause to clause, without the exec mask changing. In this case we don't actually have to update the vector IP or exec mask at all. We can just update the scalar IP, only resolving the vector IP when there is non-trivial control flow. This is what the "Back to Back" bit is for. It is set when the clause always falls through to the next clause, so that the scalar IP is simply incremented, and the exec mask is unchanged after the clause finishes.
Note that "Back to Back" can't always be set, even if the clause doesn't include a branch. For example, imagine this pseudo-assembly for a simple if construct:
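The two-step update and the back-to-back shortcut described above can be modelled in a few lines of Python. This is only a conceptual sketch, not a description of the actual hardware: the function name `resolve`, the use of small integers as clause addresses, and the four-thread setup are all assumptions made for illustration.

```python
def resolve(vector_ip, exec_mask, next_ip):
    """Full update after a clause that is NOT back-to-back (hypothetical model).

    vector_ip[t]: per-thread IP (re-convergence address for inactive threads)
    exec_mask[t]: True if thread t executed the clause that just finished
    next_ip[t]:   where thread t goes next (only read for active threads)
    """
    # Step 1: active threads take their new IP; inactive threads keep waiting.
    new_vector_ip = [
        next_ip[t] if exec_mask[t] else vector_ip[t]
        for t in range(len(vector_ip))
    ]
    # Step 2: the scalar IP is the minimum IP, and only the threads
    # sitting at that IP are active.
    scalar_ip = min(new_vector_ip)
    new_mask = [ip == scalar_ip for ip in new_vector_ip]
    return new_vector_ip, scalar_ip, new_mask


# Four threads finish clause 0; threads 0 and 2 branch to clause 3,
# threads 1 and 3 fall through to clause 1.
vector_ip, scalar_ip, exec_mask = resolve(
    [0, 0, 0, 0], [True] * 4, [3, 1, 3, 1]
)
diverged = (list(vector_ip), scalar_ip, list(exec_mask))

# Clause 1 is back-to-back: only the scalar IP moves, nothing is resolved.
scalar_ip += 1

# Clause 2 is not back-to-back: the active threads fall through to clause 3,
# where they meet the threads that have been waiting there.
vector_ip, scalar_ip, exec_mask = resolve(
    vector_ip, exec_mask, [scalar_ip + 1] * 4
)
```

In this model the threads that branched wait at address 3 with their exec mask bit cleared, and all four threads become active again once the scalar IP reaches 3.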
[source]
----
clause_0: nbb {
...
BRANCH cond, #clause_3
}
clause_1: {
...
}
clause_2: nbb {
...
}
clause_3: {
...
}
----
In addition to `clause_0`, which ends in a branch, `clause_2` is also not back-to-back, because threads which took the branch may re-converge at `clause_3` following it. These are "fallthrough branches", and must have the "Branch Conditional" bit set in addition to normal conditional branches.
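One way a compiler might decide where the bit can be set is sketched below. The rule used here (a clause is back-to-back only if it has no branch and the clause after it is not the target of any branch) is a conservative assumption derived from the example above, not a confirmed description of what the blob compiler does. The function name `back_to_back_flags` and the list-of-targets representation are hypothetical.

```python
def back_to_back_flags(branch_targets):
    """Return one flag per clause: True if the clause may be back-to-back.

    branch_targets[i] is the index of the clause that clause i branches to,
    or None if clause i contains no branch. (Hypothetical representation.)
    """
    # Every clause that some branch can jump to is a possible
    # re-convergence point where inactive threads may be waiting.
    targets = {t for t in branch_targets if t is not None}
    flags = []
    for i, target in enumerate(branch_targets):
        ends_in_branch = target is not None
        next_is_target = (i + 1) in targets
        flags.append(not ends_in_branch and not next_is_target)
    return flags


# The if construct above: clause_0 branches to clause_3, the rest fall through.
flags = back_to_back_flags([3, None, None, None])
```

For the example this marks `clause_0` and `clause_2` as not back-to-back, which matches the `nbb` annotations in the listing.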
=== Register field
A lot of variable-latency instructions have to interact with the register file in ways that would be awkward to express in the usual manner, i.e. with the per-instruction register field. For example, the STORE instruction has to read up to four 32-bit registers of data and also has to load a 64-bit address from registers. The usual pathways for reading a register can't handle this, since they're designed for reading up to three 32-bit or 64-bit registers each cycle. The LOAD instruction can't write to the register until the operation has finished, possibly well after the instruction executes. For cases like these, there's a "register" field in the clause header that lets the variable-latency instruction read or write one register, or a sequence of registers, using a completely different mechanism. Since there can only be one variable-latency instruction per clause, this field isn't ambiguous about which instruction it applies to. When the variable-latency instruction is supposed to read or write more than one register (e.g. `LOAD.v4i32`), they are read or written in sequence starting with the register specified. There are no restrictions on which register to specify, except that the reads/writes cannot go out of bounds of the register file (so a `LOAD.v4i32` with a data register of `R63` would result in a fault, since it would try to write `R63` through `R66`). The blob compiler will only use aligned register pairs and quads, but this doesn't seem to be necessary.
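The bounds rule can be illustrated with a small check. The sketch assumes a register file of 64 registers (`R0` through `R63`), which is inferred from the `R63` example rather than stated outright here. The names `NUM_REGISTERS` and `data_register_range` are hypothetical.

```python
NUM_REGISTERS = 64  # assumption: R0..R63, inferred from the R63 example


def data_register_range(first, count):
    """Return the register numbers touched by a variable-latency instruction.

    first: register number in the clause header's "register" field
    count: number of consecutive 32-bit registers read or written
    Raises ValueError where the text says the hardware would fault.
    """
    if first < 0 or count < 1:
        raise ValueError("invalid register or count")
    last = first + count - 1
    if last >= NUM_REGISTERS:
        raise ValueError(
            f"R{first} through R{last} goes past R{NUM_REGISTERS - 1}"
        )
    # No alignment check: unaligned starting registers appear to be allowed.
    return list(range(first, last + 1))


ok = data_register_range(60, 4)  # e.g. a LOAD.v4i32 into R60..R63
try:
    data_register_range(63, 4)  # would touch R63..R66
    faulted = False
except ValueError:
    faulted = True
```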