aarch64 neon support (cont'd)

mentioned in merge request !37 (closed)

changed the description

Both accumulator and 2d operations are still missing.

Does it gracefully fall back to the backup code if any of those are used? That is, can this be merged already to improve the situation or would it cause failures with the unsupported operations?
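
To make the question concrete, here is a minimal sketch of the kind of fallback being asked about: if the backend has no rule for an opcode, compilation reports failure and the caller keeps using the plain C backup function instead of crashing. Every name here (`Program`, `backend_has_rule`, `program_compile`, `program_run`, the two opcodes) is hypothetical and chosen for illustration only; this is not ORC's actual API or the code in this merge request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode set for illustration; not ORC's real opcode table. */
typedef enum { OP_ADDB, OP_LDRESLINB } Opcode;

typedef void (*KernelFn)(uint8_t *dst, const uint8_t *src, int n);

typedef struct {
  Opcode   op;        /* the single opcode this toy program uses       */
  KernelFn backup;    /* plain C implementation, always available      */
  KernelFn compiled;  /* non-NULL only if the backend could emit code  */
} Program;

/* Assumption: the backend can report whether it has a rule for an opcode. */
static int backend_has_rule(Opcode op) {
  return op != OP_LDRESLINB;  /* pretend this one is unimplemented */
}

/* Plain C backup: byte-wise add with wraparound. */
static void addb_backup(uint8_t *dst, const uint8_t *src, int n) {
  for (int i = 0; i < n; i++) dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Stand-in for generated machine code; same semantics as the backup. */
static void addb_fast(uint8_t *dst, const uint8_t *src, int n) {
  for (int i = 0; i < n; i++) dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Returns 1 on success, 0 if the opcode is unsupported.
 * On failure the program is left usable through its backup function. */
static int program_compile(Program *p, KernelFn fast) {
  if (!backend_has_rule(p->op)) {
    p->compiled = NULL;
    return 0;
  }
  p->compiled = fast;
  return 1;
}

/* Pick the compiled kernel when present, otherwise the backup. */
static KernelFn program_pick(const Program *p) {
  return p->compiled != NULL ? p->compiled : p->backup;
}

static void program_run(const Program *p, uint8_t *dst,
                        const uint8_t *src, int n) {
  assert(n >= 0);
  program_pick(p)(dst, src, n);
}
```

With this shape, a program that uses an unimplemented opcode still runs, only more slowly, which is the "graceful" behaviour the question is about. Whether the aarch64 backend in this branch behaves this way is exactly what needs confirming.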

added 57 commits

a3b51418...8fc6bdf5 - 20 commits from branch gstreamer:master
80c1defe - aarch64: make some setups for aarch64 support
7c64f89f - aarch64: implement emits for general instructions
9264fd76 - aarch64: implement emits for some vector instructions and ORC ops (add)
ae4e4d6a - aarch64: orcprogram-neon porting to aarch64
4bd7309e - aarch64: Use 64bit operations on 64bit pointers
de0b15dd - aarch64: Repair 8bit load/store opcode
7d1c4da3 - aarch64: Repair emit for imm 1
8b8dade9 - aarch64: Repair storeX instructions
4c83f87e - aarch64: Implement unary instruction emit
42a956a1 - aarch64: Implement convX instructions
d17fcdd7 - aarch64: Implement select{0,1}X instructions
1fafc66a - aarch64: Implement mulhX instructions
1cdfaaba - aarch64: Implement mov instructions
5a6baa5f - aarch64: Implement shift instructions
bf7603ff - aarch64: Implement loadX instructions
5345fc6f - aarch64: Clean up mergeX/splatX instructions
e44e7e38 - aarch64: Implement mergeX instructions
4455a443 - aarch64: Implement copyX/orX instructions
3cc2d82e - aarch64: Implement xorX instructions
f8c53ae3 - aarch64: Implement absX instructions
629e7ea2 - aarch64: Implement andX instructions
a74f2c47 - aarch64: Implement subX instructions
65391500 - aarch64: Implement loadiX instructions
3d873fd0 - aarch64: Implement accX instructions
b084cb8f - aarch64: Implement vminX/vmaxX instructions
cd198fa6 - aarch64: Implement signX instructions
35c8f664 - aarch64: Implement splitX/splatX instructions
30f15617 - aarch64: Implement loadupdb instruction
2aa2e3f3 - aarch64: Implement avgX instructions
9b619424 - aarch64: Implement cmpX instructions
8d784b4d - aarch64: Implement mulX instructions
46f6e6a0 - aarch64: Implement div255w instruction
2c55d753 - aarch64: Implement swapX instructions
9f99280a - aarch64: Implement splatw3q instruction
8c39b126 - aarch64: Implement andn instruction
ee2c7eaa - aarch64: Implement floating-point arithmetic instructions
b93dd9ca - aarch64: Implement accumulator store

added 1 commit

b1f3be50 - aarch64: Implement const64 loadiq

added 16 commits

2012ee8e - aarch64: Implement accX instructions
2b2ecc44 - aarch64: Implement vminX/vmaxX instructions
49cc5250 - aarch64: Implement signX instructions
d6b987e5 - aarch64: Implement splitX/splatX instructions
a00198bd - aarch64: Implement loadupdb instruction
6ab7e50a - aarch64: Implement avgX instructions
ab81317b - aarch64: Implement cmpX instructions
d2c9d4f2 - aarch64: Implement mulX instructions
e68b5eac - aarch64: Implement div255w instruction
7a5a5e7e - aarch64: Implement swapX instructions
179973d3 - aarch64: Implement splatw3q instruction
e17726ce - aarch64: Implement andn instruction
c88cb606 - aarch64: Implement floating-point arithmetic instructions
19819c61 - aarch64: Implement accumulator store
43d57033 - aarch64: Implement const64 loadiq
eb027f95 - aarch64: Implement flags2d

added 25 commits

37910403 - aarch64: Implement loadX instructions
cd793476 - aarch64: Clean up mergeX/splatX instructions
8dd3da66 - aarch64: Implement mergeX instructions
4de4e86f - aarch64: Implement copyX/orX instructions
1583be7d - aarch64: Implement xorX instructions
cc12b32f - aarch64: Implement absX instructions
8624e098 - aarch64: Implement andX instructions
b81b455f - aarch64: Implement subX instructions
c158b220 - aarch64: Implement loadiX instructions
d2692517 - aarch64: Implement accX instructions
a79dee7b - aarch64: Implement vminX/vmaxX instructions
934c3c07 - aarch64: Implement signX instructions
3d8a3baf - aarch64: Implement splitX/splatX instructions
622779b5 - aarch64: Implement loadupdb instruction
6987dcad - aarch64: Implement avgX instructions
00943762 - aarch64: Implement cmpX instructions
ac30e990 - aarch64: Implement mulX instructions
e4339b43 - aarch64: Implement div255w instruction
2b0b1a04 - aarch64: Implement swapX instructions
b0c0092f - aarch64: Implement splatw3q instruction
d4ba6b5b - aarch64: Implement andn instruction
8922c7c4 - aarch64: Implement floating-point arithmetic instructions
bcd510e4 - aarch64: Implement accumulator store
c53f6898 - aarch64: Implement const64 loadiq
09a333e9 - aarch64: Implement flags2d

added 3 commits

d89699c6 - aarch64: Implement double-precision floating-point arithmetic instructions
22109801 - aarch64: Implement divf instruction
e82bd525 - aarch64: Implement sqrtf instruction

added 3 commits

53fc0a5c - aarch64: Implement double-precision floating-point arithmetic instructions
1d0bfd62 - aarch64: Implement divf instruction
2759a325 - aarch64: Implement sqrtf instruction

added 32 commits

d6dc365e - aarch64: Implement select{0,1}X instructions
3dd3d3e9 - aarch64: Implement mulhX instructions
fa368b10 - aarch64: Implement mov instructions
64f5a009 - aarch64: Implement shift instructions
c842588f - aarch64: Implement loadX instructions
e92f62be - aarch64: Clean up mergeX/splatX instructions
c3fd0b27 - aarch64: Implement mergeX instructions
4b12ee1e - aarch64: Implement copyX/orX instructions
be11f5a8 - aarch64: Implement xorX instructions
084ad6f7 - aarch64: Implement absX instructions
40277eb7 - aarch64: Implement andX instructions
f6145dc8 - aarch64: Implement subX instructions
ef67e36a - aarch64: Implement loadiX instructions
26651ffc - aarch64: Implement accX instructions
f5b0e20e - aarch64: Implement vminX/vmaxX instructions
8b6682b2 - aarch64: Implement signX instructions
82318e2a - aarch64: Implement splitX/splatX instructions
01677c5a - aarch64: Implement loadupdb instruction
b4a7374f - aarch64: Implement avgX instructions
40670229 - aarch64: Implement cmpX instructions
2818bf91 - aarch64: Implement mulX instructions
30acad7d - aarch64: Implement div255w instruction
1e3a7765 - aarch64: Implement swapX instructions
2773f9f3 - aarch64: Implement splatw3q instruction
c8eed996 - aarch64: Implement andn instruction
29105437 - aarch64: Implement floating-point arithmetic instructions
686d1684 - aarch64: Implement accumulator store
e184d5b0 - aarch64: Implement const64 loadiq
fd3a31e0 - aarch64: Implement flags2d
b0c6bcfb - aarch64: Implement double-precision floating-point arithmetic instructions
c412702b - aarch64: Implement divf instruction
5bacf34e - aarch64: Implement sqrtf instruction

added 42 commits

6ad01fcf - aarch64: make some setups for aarch64 support
788c909b - aarch64: implement emits for general instructions
ee91c27b - aarch64: implement emits for some vector instructions and ORC ops (add)
f8aaecb5 - aarch64: orcprogram-neon porting to aarch64
55a2b65c - aarch64: Use 64bit operations on 64bit pointers
68236d84 - aarch64: Repair 8bit load/store opcode
cc15cb84 - aarch64: Repair emit for imm 1
a0df3bb0 - aarch64: Repair storeX instructions
0e9f3693 - aarch64: Implement unary instruction emit
ae90f1e4 - aarch64: Implement convX instructions
f130fed6 - aarch64: Implement select{0,1}X instructions
7f232ef6 - aarch64: Implement mulhX instructions
80992353 - aarch64: Implement mov instructions
6b61afbc - aarch64: Implement shift instructions
527e340e - aarch64: Implement loadX instructions
78a960dc - aarch64: Clean up mergeX/splatX instructions
389e9eb9 - aarch64: Implement mergeX instructions
704b3683 - aarch64: Implement copyX/orX instructions
fb46152c - aarch64: Implement xorX instructions
e22be081 - aarch64: Implement absX instructions
cfc7c896 - aarch64: Implement andX instructions
d9dccb9b - aarch64: Implement subX instructions
7a66b215 - aarch64: Implement loadiX instructions
7fabaa7d - aarch64: Implement accX instructions
8671f23c - aarch64: Implement vminX/vmaxX instructions
f7031940 - aarch64: Implement signX instructions
b22fa663 - aarch64: Implement splitX/splatX instructions
860411c3 - aarch64: Implement loadupdb instruction
b476c2cc - aarch64: Implement avgX instructions
4f1ff670 - aarch64: Implement cmpX instructions
da181a13 - aarch64: Implement mulX instructions
4737af57 - aarch64: Implement div255w instruction
5ee54c3e - aarch64: Implement swapX instructions
e6a41e25 - aarch64: Implement splatw3q instruction
6b96b59f - aarch64: Implement andn instruction
b8e65adb - aarch64: Implement floating-point arithmetic instructions
e0408513 - aarch64: Implement accumulator store
d84b1687 - aarch64: Implement const64 loadiq
237746ae - aarch64: Implement flags2d
f4f34d58 - aarch64: Implement double-precision floating-point arithmetic instructions
d4755780 - aarch64: Implement divf instruction
fc42f9c5 - aarch64: Implement sqrtf instruction

changed the description

Both aarch32 and aarch64 pass the ORC tests, as well as whatever other ORC code I could find and test. Compilation still fails for the ldres{lin,near}{l,b} opcodes because they are not implemented, but they were missing before this change too. I think this can be merged now.

added 26 commits

edbd8a0f - aarch64: Implement mergeX instructions
b9d47aa0 - aarch64: Implement copyX/orX instructions
f2521eff - aarch64: Implement xorX instructions
ebf84090 - aarch64: Implement absX instructions
dad18cc9 - aarch64: Implement andX instructions
52a27376 - aarch64: Implement subX instructions
c148e5d5 - aarch64: Implement loadiX instructions
34ecc949 - aarch64: Implement accX instructions
f2fdf4bc - aarch64: Implement vminX/vmaxX instructions
477cd20c - aarch64: Implement signX instructions
7dd9d4d3 - aarch64: Implement splitX/splatX instructions
925cbe0a - aarch64: Implement loadupdb instruction
3b240459 - aarch64: Implement avgX instructions
0f070cdd - aarch64: Implement cmpX instructions
f4575d48 - aarch64: Implement mulX instructions
65d73339 - aarch64: Implement div255w instruction
2c9545fe - aarch64: Implement swapX instructions
1d57567f - aarch64: Implement splatw3q instruction
d3710200 - aarch64: Implement andn instruction
f4553ccd - aarch64: Implement floating-point arithmetic instructions
6bbd74d8 - aarch64: Implement accumulator store
952bce19 - aarch64: Implement const64 loadiq
c9166faa - aarch64: Implement flags2d
56c122a9 - aarch64: Implement double-precision floating-point arithmetic instructions
18803a0c - aarch64: Implement divf instruction
48dbc63d - aarch64: Implement sqrtf instruction

added 1 commit

443eddbe - aarch64: Fix MSVC warnings

added 1 commit

4c1c2fe1 - aarch64: Fix MSVC warnings

added 35 commits

164bcb6f - aarch64: Fix MSVC warnings
ccf12ae3 - aarch64: Implement unary instruction emit
5503b7e0 - aarch64: Implement convX instructions
ae6de7ac - aarch64: Implement select{0,1}X instructions
9cd4030e - aarch64: Implement mulhX instructions
1f55fc25 - aarch64: Implement mov instructions
58900f71 - aarch64: Implement shift instructions
5820f57a - aarch64: Implement loadX instructions
da85036e - aarch64: Clean up mergeX/splatX instructions
255b0ea2 - aarch64: Implement mergeX instructions
069cab4a - aarch64: Implement copyX/orX instructions
10a43495 - aarch64: Implement xorX instructions
5aba5c42 - aarch64: Implement absX instructions
93eab062 - aarch64: Implement andX instructions
00bf8149 - aarch64: Implement subX instructions
3de19796 - aarch64: Implement loadiX instructions
cde478dc - aarch64: Implement accX instructions
a8dfd255 - aarch64: Implement vminX/vmaxX instructions
7525fefd - aarch64: Implement signX instructions
85c14b67 - aarch64: Implement splitX/splatX instructions
8dc2714b - aarch64: Implement loadupdb instruction
b74936d7 - aarch64: Implement avgX instructions
774f2b04 - aarch64: Implement cmpX instructions
bfb0b9a0 - aarch64: Implement mulX instructions
7de2087e - aarch64: Implement div255w instruction
c6d25ab9 - aarch64: Implement swapX instructions
8a0696c3 - aarch64: Implement splatw3q instruction
d38ef503 - aarch64: Implement andn instruction
1828d761 - aarch64: Implement floating-point arithmetic instructions
588bb481 - aarch64: Implement accumulator store
d561250b - aarch64: Implement const64 loadiq
65b648f0 - aarch64: Implement flags2d
9751e1fc - aarch64: Implement double-precision floating-point arithmetic instructions
cb33d35b - aarch64: Implement divf instruction
ba88b06f - aarch64: Implement sqrtf instruction

added 16 commits

31774aca - aarch64: Implement loadupdb instruction
46ae28fa - aarch32: Implement loadupdb instruction
651978f6 - aarch64: Implement avgX instructions
f490214e - aarch64: Implement cmpX instructions
61d84d8f - aarch64: Implement mulX instructions
27996e36 - aarch64: Implement div255w instruction
97397db5 - aarch64: Implement swapX instructions
16449550 - aarch64: Implement splatw3q instruction
efbd5a64 - aarch64: Implement andn instruction
ff490170 - aarch64: Implement floating-point arithmetic instructions
732bb307 - aarch64: Implement accumulator store
76ac0041 - aarch64: Implement const64 loadiq
eaea8117 - aarch64: Implement flags2d
84b6a1d7 - aarch64: Implement double-precision floating-point arithmetic instructions
dc2d0945 - aarch64: Implement divf instruction
b1a17ab0 - aarch64: Implement sqrtf instruction

added 15 commits

b0e1a59a - aarch32: Implement loadupdb instruction
b701af88 - aarch64: Implement avgX instructions
d8e7a23b - aarch64: Implement cmpX instructions
78cef540 - aarch64: Implement mulX instructions
5c282df5 - aarch64: Implement div255w instruction
47c4c819 - aarch64: Implement swapX instructions
edf6a010 - aarch64: Implement splatw3q instruction
bc51fc64 - aarch64: Implement andn instruction
20a716da - aarch64: Implement floating-point arithmetic instructions
36229f79 - aarch64: Implement accumulator store
d215ba96 - aarch64: Implement const64 loadiq
9a228d60 - aarch64: Implement flags2d
7be398bc - aarch64: Implement double-precision floating-point arithmetic instructions
ed2be3a6 - aarch64: Implement divf instruction
63f3a5dd - aarch64: Implement sqrtf instruction

unmarked as a Work In Progress
