@@ -181,7 +181,7 @@ There is a trivial limit for the number of constants per clause: since each inst

== Algorithm for packing clauses

-This section describes how the blob compiler seems to use these formats to pack as many instructions and constants into as few words as possible. There may be other equivalent ways to do it, but given how complicated all the different formats are, and that the hardware decoder and encoding algorithm were developed in tandem, it's probably best to stick with what the blob does.

+This section describes how the blob compiler seems to use these formats to pack as many instructions and constants into as few words as possible. There may be other equivalent ways to do it, but given how complicated all the different formats are, and that the hardware decoder and encoding algorithm were most likely developed in tandem, it's probably best to stick with what the blob does.

First, we assign instructions to quadwords. We may assign an entire instruction to a quadword, or we may split an instruction across two quadwords.

...

...

@@ -192,16 +192,16 @@ First, we assign instructions to quadwords. We may assign an entire instruction

- Assign the fifth instruction to the fourth quadword.

- Split the sixth instruction across the fourth and fifth quadword.

- Assign the seventh instruction to the fifth quadword.

-- Split the eighth instruction across the fifth and sixth quadwords.

+- Assign the eighth instruction to the sixth quadword.

Simply go down the list until there are no more instructions left.
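As a rough illustration, the steps listed above can be written as a lookup table. This is only a sketch: the names `PLACEMENT` and `quadword_contents` are made up here, instructions and quadwords are numbered from 1 as in the prose, and only the fifth through eighth instructions are tabulated because those are the steps shown in this excerpt. The eighth instruction is taken as assigned whole to the sixth quadword.

```python
# Hypothetical table mirroring the list above: instruction ordinal -> the
# 1-based quadword(s) it occupies. Two quadwords means the instruction is
# split. Only the steps visible in this excerpt are included.
PLACEMENT = {
    5: (4,),
    6: (4, 5),
    7: (5,),
    8: (6,),  # assumed: assigned whole to the sixth quadword
}


def quadword_contents(instructions):
    """Map each quadword number to the (instruction, kind) pairs placed in it.

    kind is "whole" for an instruction that sits in a single quadword and
    "split" for each part of an instruction spread over two quadwords.
    """
    contents = {}
    for instr in instructions:
        quadwords = PLACEMENT[instr]
        kind = "whole" if len(quadwords) == 1 else "split"
        for qw in quadwords:
            contents.setdefault(qw, []).append((instr, kind))
    return contents


if __name__ == "__main__":
    # Walk the list for the fifth through eighth instructions.
    for qw, parts in sorted(quadword_contents(range(5, 9)).items()):
        print(qw, parts)
```

Keeping the placement as data rather than control flow makes it easy to compare against the blob's output one quadword at a time.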

-Now, we assign constants to quadwords if we have any. We do this by looking at the last quadword, and do the following:

+Now, we assign constants to quadwords if we have any. If there were at least three instructions, then we can add one 64-bit constant for free in the leftover bits. We do this by looking at the last quadword, and do the following:

- If it only contains an instruction that was split across two quadwords, then there are 75 bits free. Put the constant where the next instruction would have gone, and use the appropriate format to indicate that.

-- If it only contains one instruction, and the previous quadword has an instruction and a split instruction, then we can split the constant across the last two instructions.

+- If it only contains one instruction, and the previous quadword has an instruction and a split instruction, then we can split the constant across the last two quadwords.

-For any remaining constants, we simply add quadwords with two constants each. Note that in some cases, we need to add a "dummy" constant, even when the clause doesn't use any constants, because there's no format that does what we want. For example, say that we have a clause with 5 instructions and no constants. The fourth quadword is supposed to contain only instruction 4, which is the final instruction, but there is no format for that. Instead, we add a constant split across the third and fourth quadwords, since there is a format with a final instruction 4 and part of a constant. From a design point of view, this reduces the number of possible formats, which reduces the complexity of the decoder and means less bits are needed to describe the format.

+For any remaining constants, we simply add quadwords with two constants each, padding with an extra 0 constant if necessary. Note that in some cases, we need to add a "dummy" constant, even when the clause doesn't use any constants, because there's no format that does what we want. For example, say that we have a clause with 5 instructions and no constants. The fourth quadword is supposed to contain only instruction 4, which is the final instruction, but there is no format for that. Instead, we add a constant split across the third and fourth quadwords, since there is a format with a final instruction 4 and part of a constant. In general, we need to do this whenever a constant could be inserted "for free", i.e. whenever there are at least three instructions. From a design point of view, this reduces the number of possible formats, which reduces the complexity of the decoder and means fewer bits are needed to describe the format.
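The pairing step for leftover constants is simple enough to sketch. The function name `pair_remaining_constants` is hypothetical, constants are assumed to be plain integers standing in for 64-bit values, and how each pair is actually laid out inside a quadword is not modeled here.

```python
def pair_remaining_constants(constants):
    """Group leftover constants two per quadword, padding with a 0 constant."""
    padded = list(constants)
    if len(padded) % 2:
        # Odd count: add an extra 0 constant so the last quadword is full.
        padded.append(0)
    return [(padded[i], padded[i + 1]) for i in range(0, len(padded), 2)]


if __name__ == "__main__":
    # Three leftover constants need two extra quadwords, the second one padded.
    print(pair_remaining_constants([0x3F800000, 0x40000000, 0x40400000]))
```

The "dummy" constant case described above would be handled before this step, since it concerns the free slot next to the instructions rather than the extra constant-only quadwords.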