Unverified commit 704d34d0, authored by Connor Abbott, committed by GitHub

bifrost: Finish a sentence

parent 8996caa0
@@ -31,7 +31,7 @@ Note that in addition to the execution unit, which we've been describing, there
 The execution unit interacts with these fixed-function blocks through special, variable-latency instructions in the FMA and ADD units. They bypass the usual, fixed-latency mechanism for reading/writing registers, and as such instructions in the same clause can't assume that registers have been read/written after the instruction is done (TODO: verify this for registers being read, and not just written). Instead, any dependent instructions must be put into a separate clause with the appropriate dependencies in the clause header set.
 = Clauses
-Conceptually, each clause consists of a clause header, followed by one or more 78-bit instruction words and then zero or more 60-bit constants. (Constants are actually 64 bits, but they're loaded the same port as uniform registers and share the same field in the instruction word, which includes 7 bits to choose which uniform register to load, some of which would be unused for constants, so ARM decided to be clever and stick the bottom 4 bits in each instruction where the constant is loaded, so the actual constants in the instruction stream are only 60 bits). But the instruction fetching hardware only works in 128-bit quadwords, so each clause has to be a multiple of 128 bits. To make the representation of the clauses as compact as possible which still making the decoding circuitry relatively simple, the instructions are packed so that two 128-bit quadwords can store 3 78-bit instructions, or 3 128-bit quadwords can store 4 instructions and a 60-bit constant. There were some bits left over, which seem to have been used to obviate the need to keep track of state between each word, simplifying the decoder and making it possible to decode the quadwords in parallel. Thus, the quadwords can be (almost) arbitrarily reordered while still Each format fully describes which instruction(s) in the decoded clause the bits in the quadword represent, and whether one of those instructions is the last instruction.
+Conceptually, each clause consists of a clause header, followed by one or more 78-bit instruction words and then zero or more 60-bit constants. (Constants are actually 64 bits, but they're loaded the same port as uniform registers and share the same field in the instruction word, which includes 7 bits to choose which uniform register to load, some of which would be unused for constants, so ARM decided to be clever and stick the bottom 4 bits in each instruction where the constant is loaded, so the actual constants in the instruction stream are only 60 bits). But the instruction fetching hardware only works in 128-bit quadwords, so each clause has to be a multiple of 128 bits. To make the representation of the clauses as compact as possible which still making the decoding circuitry relatively simple, the instructions are packed so that two 128-bit quadwords can store 3 78-bit instructions, or 3 128-bit quadwords can store 4 instructions and a 60-bit constant. There were some bits left over, which seem to have been used to obviate the need to keep track of state between each word, simplifying the decoder and making it possible to decode the quadwords in parallel. Thus, the quadwords can be (almost) arbitrarily reordered while still retaining the meaning of the clause. (It's unknown whether this works in practice, but theoretically it could be done.) Each format fully describes which instruction(s) in the decoded clause the bits in the quadword represent, and whether one of those instructions is the last instruction.
 == Quadword formats
 The bottom 8 bits of each 128-bit quadword are a "tag" that's read by the decoding circuitry. They describe how to interpret the rest of the word, as well as possibly containing some bits from some of the instructions to decode if there were more bits to spare. Each possible tag value is described below.
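
Review note: the bit counts quoted in the Clauses paragraph can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the commit. The helper names (`spare_bits`, `split_constant`, `join_constant`) are hypothetical, and it assumes the 60 bits kept in the instruction stream are simply the upper 60 bits of the 64-bit constant; the text only says that the bottom 4 bits live in the instruction word.

```python
# Sizes taken from the Clauses paragraph in this diff.
QUADWORD_BITS = 128
INSTRUCTION_BITS = 78
STREAM_CONSTANT_BITS = 60   # part of a constant stored in the instruction stream
INLINE_CONSTANT_BITS = 4    # bottom bits stored in the instruction word itself


def spare_bits(quadwords, instructions, constants=0):
    """Bits left in `quadwords` quadwords after packing the given payload.

    The result still includes whatever the per-quadword tag needs.
    """
    total = quadwords * QUADWORD_BITS
    payload = (instructions * INSTRUCTION_BITS
               + constants * STREAM_CONSTANT_BITS)
    return total - payload


def split_constant(value):
    """Split a 64-bit constant into (upper 60 bits, bottom 4 bits).

    Assumption: the stream holds the upper 60 bits unchanged.
    """
    if not 0 <= value < (1 << 64):
        raise ValueError("constant must fit in 64 bits")
    low_mask = (1 << INLINE_CONSTANT_BITS) - 1
    return value >> INLINE_CONSTANT_BITS, value & low_mask


def join_constant(high, low):
    """Rebuild the 64-bit constant from the two pieces."""
    return (high << INLINE_CONSTANT_BITS) | low


if __name__ == "__main__":
    print(spare_bits(2, 3))      # two quadwords, three instructions
    print(spare_bits(3, 4, 1))   # three quadwords, four instructions, one constant
```

Under these numbers, two quadwords holding three instructions leave 22 bits, and three quadwords holding four instructions plus a constant leave 12 bits. The second figure is smaller than three full 8-bit tags, which seems consistent with the Quadword formats paragraph saying that the tag may also carry some instruction bits.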