aarch64 neon support (cont'd)
This extends MR !37 (closed) and implements most of the remaining AArch64 instructions. This is enough to accelerate e.g. videotestsrc. Accumulators and flags2d are also implemented.
Acceleration of videoconvert from I420 to RGB (useful e.g. for jpegdec and openh264dec) is not implemented because the loadupdb implementation is missing. This will be added in a separate MR.
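For context on why loadupdb matters for I420 to RGB: the chroma planes in I420 are half the luma width, so each chroma byte has to be used for two output pixels. The sketch below is a scalar reference of that duplicating load as I understand the opcode's semantics (each source byte written twice). The function name `loadupdb_ref` is hypothetical and this is not the ORC emitter code from this MR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference for a duplicating upsampled byte load.
 * Assumption: output element i takes source element i / 2, so n output
 * bytes consume (n + 1) / 2 source bytes. */
static void
loadupdb_ref (uint8_t *dst, const uint8_t *src, size_t n)
{
  size_t i;

  assert (dst != NULL && src != NULL);

  for (i = 0; i < n; i++)
    dst[i] = src[i >> 1];   /* every source byte appears twice in a row */
}
```

A NEON implementation would presumably load half a vector and interleave it with itself, but that detail is left to the actual commits.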
Activity
mentioned in merge request !37 (closed)
added 57 commits
- a3b51418...8fc6bdf5 - 20 commits from branch gstreamer:master
- 80c1defe - aarch64: make some setups for aarch64 support
- 7c64f89f - aarch64: implement emits for general instructions
- 9264fd76 - aarch64: implement emits for some vector instructions and ORC ops (add)
- ae4e4d6a - aarch64: orcprogram-neon porting to aarch64
- 4bd7309e - aarch64: Use 64bit operations on 64bit pointers
- de0b15dd - aarch64: Repair 8bit load/store opcode
- 7d1c4da3 - aarch64: Repair emit for imm 1
- 8b8dade9 - aarch64: Repair storeX instructions
- 4c83f87e - aarch64: Implement unary instruction emit
- 42a956a1 - aarch64: Implement convX instructions
- d17fcdd7 - aarch64: Implement select{0,1}X instructions
- 1fafc66a - aarch64: Implement mulhX instructions
- 1cdfaaba - aarch64: Implement mov instructions
- 5a6baa5f - aarch64: Implement shift instructions
- bf7603ff - aarch64: Implement loadX instructions
- 5345fc6f - aarch64: Clean up mergeX/splatX instructions
- e44e7e38 - aarch64: Implement mergeX instructions
- 4455a443 - aarch64: Implement copyX/orX instructions
- 3cc2d82e - aarch64: Implement xorX instructions
- f8c53ae3 - aarch64: Implement absX instructions
- 629e7ea2 - aarch64: Implement andX instructions
- a74f2c47 - aarch64: Implement subX instructions
- 65391500 - aarch64: Implement loadiX instructions
- 3d873fd0 - aarch64: Implement accX instructions
- b084cb8f - aarch64: Implement vminX/vmaxX instructions
- cd198fa6 - aarch64: Implement signX instructions
- 35c8f664 - aarch64: Implement splitX/splatX instructions
- 30f15617 - aarch64: Implement loadupdb instruction
- 2aa2e3f3 - aarch64: Implement avgX instructions
- 9b619424 - aarch64: Implement cmpX instructions
- 8d784b4d - aarch64: Implement mulX instructions
- 46f6e6a0 - aarch64: Implement div255w instruction
- 2c55d753 - aarch64: Implement swapX instructions
- 9f99280a - aarch64: Implement splatw3q instruction
- 8c39b126 - aarch64: Implement andn instruction
- ee2c7eaa - aarch64: Implement floating-point arithmetic instructions
- b93dd9ca - aarch64: Implement accumulator store
added 16 commits
- 2012ee8e - aarch64: Implement accX instructions
- 2b2ecc44 - aarch64: Implement vminX/vmaxX instructions
- 49cc5250 - aarch64: Implement signX instructions
- d6b987e5 - aarch64: Implement splitX/splatX instructions
- a00198bd - aarch64: Implement loadupdb instruction
- 6ab7e50a - aarch64: Implement avgX instructions
- ab81317b - aarch64: Implement cmpX instructions
- d2c9d4f2 - aarch64: Implement mulX instructions
- e68b5eac - aarch64: Implement div255w instruction
- 7a5a5e7e - aarch64: Implement swapX instructions
- 179973d3 - aarch64: Implement splatw3q instruction
- e17726ce - aarch64: Implement andn instruction
- c88cb606 - aarch64: Implement floating-point arithmetic instructions
- 19819c61 - aarch64: Implement accumulator store
- 43d57033 - aarch64: Implement const64 loadiq
- eb027f95 - aarch64: Implement flags2d
added 25 commits
- 37910403 - aarch64: Implement loadX instructions
- cd793476 - aarch64: Clean up mergeX/splatX instructions
- 8dd3da66 - aarch64: Implement mergeX instructions
- 4de4e86f - aarch64: Implement copyX/orX instructions
- 1583be7d - aarch64: Implement xorX instructions
- cc12b32f - aarch64: Implement absX instructions
- 8624e098 - aarch64: Implement andX instructions
- b81b455f - aarch64: Implement subX instructions
- c158b220 - aarch64: Implement loadiX instructions
- d2692517 - aarch64: Implement accX instructions
- a79dee7b - aarch64: Implement vminX/vmaxX instructions
- 934c3c07 - aarch64: Implement signX instructions
- 3d8a3baf - aarch64: Implement splitX/splatX instructions
- 622779b5 - aarch64: Implement loadupdb instruction
- 6987dcad - aarch64: Implement avgX instructions
- 00943762 - aarch64: Implement cmpX instructions
- ac30e990 - aarch64: Implement mulX instructions
- e4339b43 - aarch64: Implement div255w instruction
- 2b0b1a04 - aarch64: Implement swapX instructions
- b0c0092f - aarch64: Implement splatw3q instruction
- d4ba6b5b - aarch64: Implement andn instruction
- 8922c7c4 - aarch64: Implement floating-point arithmetic instructions
- bcd510e4 - aarch64: Implement accumulator store
- c53f6898 - aarch64: Implement const64 loadiq
- 09a333e9 - aarch64: Implement flags2d
added 32 commits
- d6dc365e - aarch64: Implement select{0,1}X instructions
- 3dd3d3e9 - aarch64: Implement mulhX instructions
- fa368b10 - aarch64: Implement mov instructions
- 64f5a009 - aarch64: Implement shift instructions
- c842588f - aarch64: Implement loadX instructions
- e92f62be - aarch64: Clean up mergeX/splatX instructions
- c3fd0b27 - aarch64: Implement mergeX instructions
- 4b12ee1e - aarch64: Implement copyX/orX instructions
- be11f5a8 - aarch64: Implement xorX instructions
- 084ad6f7 - aarch64: Implement absX instructions
- 40277eb7 - aarch64: Implement andX instructions
- f6145dc8 - aarch64: Implement subX instructions
- ef67e36a - aarch64: Implement loadiX instructions
- 26651ffc - aarch64: Implement accX instructions
- f5b0e20e - aarch64: Implement vminX/vmaxX instructions
- 8b6682b2 - aarch64: Implement signX instructions
- 82318e2a - aarch64: Implement splitX/splatX instructions
- 01677c5a - aarch64: Implement loadupdb instruction
- b4a7374f - aarch64: Implement avgX instructions
- 40670229 - aarch64: Implement cmpX instructions
- 2818bf91 - aarch64: Implement mulX instructions
- 30acad7d - aarch64: Implement div255w instruction
- 1e3a7765 - aarch64: Implement swapX instructions
- 2773f9f3 - aarch64: Implement splatw3q instruction
- c8eed996 - aarch64: Implement andn instruction
- 29105437 - aarch64: Implement floating-point arithmetic instructions
- 686d1684 - aarch64: Implement accumulator store
- e184d5b0 - aarch64: Implement const64 loadiq
- fd3a31e0 - aarch64: Implement flags2d
- b0c6bcfb - aarch64: Implement double-precision floating-point arithmetic instructions
- c412702b - aarch64: Implement divf instruction
- 5bacf34e - aarch64: Implement sqrtf instruction
added 42 commits
- 6ad01fcf - aarch64: make some setups for aarch64 support
- 788c909b - aarch64: implement emits for general instructions
- ee91c27b - aarch64: implement emits for some vector instructions and ORC ops (add)
- f8aaecb5 - aarch64: orcprogram-neon porting to aarch64
- 55a2b65c - aarch64: Use 64bit operations on 64bit pointers
- 68236d84 - aarch64: Repair 8bit load/store opcode
- cc15cb84 - aarch64: Repair emit for imm 1
- a0df3bb0 - aarch64: Repair storeX instructions
- 0e9f3693 - aarch64: Implement unary instruction emit
- ae90f1e4 - aarch64: Implement convX instructions
- f130fed6 - aarch64: Implement select{0,1}X instructions
- 7f232ef6 - aarch64: Implement mulhX instructions
- 80992353 - aarch64: Implement mov instructions
- 6b61afbc - aarch64: Implement shift instructions
- 527e340e - aarch64: Implement loadX instructions
- 78a960dc - aarch64: Clean up mergeX/splatX instructions
- 389e9eb9 - aarch64: Implement mergeX instructions
- 704b3683 - aarch64: Implement copyX/orX instructions
- fb46152c - aarch64: Implement xorX instructions
- e22be081 - aarch64: Implement absX instructions
- cfc7c896 - aarch64: Implement andX instructions
- d9dccb9b - aarch64: Implement subX instructions
- 7a66b215 - aarch64: Implement loadiX instructions
- 7fabaa7d - aarch64: Implement accX instructions
- 8671f23c - aarch64: Implement vminX/vmaxX instructions
- f7031940 - aarch64: Implement signX instructions
- b22fa663 - aarch64: Implement splitX/splatX instructions
- 860411c3 - aarch64: Implement loadupdb instruction
- b476c2cc - aarch64: Implement avgX instructions
- 4f1ff670 - aarch64: Implement cmpX instructions
- da181a13 - aarch64: Implement mulX instructions
- 4737af57 - aarch64: Implement div255w instruction
- 5ee54c3e - aarch64: Implement swapX instructions
- e6a41e25 - aarch64: Implement splatw3q instruction
- 6b96b59f - aarch64: Implement andn instruction
- b8e65adb - aarch64: Implement floating-point arithmetic instructions
- e0408513 - aarch64: Implement accumulator store
- d84b1687 - aarch64: Implement const64 loadiq
- 237746ae - aarch64: Implement flags2d
- f4f34d58 - aarch64: Implement double-precision floating-point arithmetic instructions
- d4755780 - aarch64: Implement divf instruction
- fc42f9c5 - aarch64: Implement sqrtf instruction
added 26 commits
- edbd8a0f - aarch64: Implement mergeX instructions
- b9d47aa0 - aarch64: Implement copyX/orX instructions
- f2521eff - aarch64: Implement xorX instructions
- ebf84090 - aarch64: Implement absX instructions
- dad18cc9 - aarch64: Implement andX instructions
- 52a27376 - aarch64: Implement subX instructions
- c148e5d5 - aarch64: Implement loadiX instructions
- 34ecc949 - aarch64: Implement accX instructions
- f2fdf4bc - aarch64: Implement vminX/vmaxX instructions
- 477cd20c - aarch64: Implement signX instructions
- 7dd9d4d3 - aarch64: Implement splitX/splatX instructions
- 925cbe0a - aarch64: Implement loadupdb instruction
- 3b240459 - aarch64: Implement avgX instructions
- 0f070cdd - aarch64: Implement cmpX instructions
- f4575d48 - aarch64: Implement mulX instructions
- 65d73339 - aarch64: Implement div255w instruction
- 2c9545fe - aarch64: Implement swapX instructions
- 1d57567f - aarch64: Implement splatw3q instruction
- d3710200 - aarch64: Implement andn instruction
- f4553ccd - aarch64: Implement floating-point arithmetic instructions
- 6bbd74d8 - aarch64: Implement accumulator store
- 952bce19 - aarch64: Implement const64 loadiq
- c9166faa - aarch64: Implement flags2d
- 56c122a9 - aarch64: Implement double-precision floating-point arithmetic instructions
- 18803a0c - aarch64: Implement divf instruction
- 48dbc63d - aarch64: Implement sqrtf instruction
added 35 commits
- 164bcb6f - aarch64: Fix MSVC warnings
- ccf12ae3 - aarch64: Implement unary instruction emit
- 5503b7e0 - aarch64: Implement convX instructions
- ae6de7ac - aarch64: Implement select{0,1}X instructions
- 9cd4030e - aarch64: Implement mulhX instructions
- 1f55fc25 - aarch64: Implement mov instructions
- 58900f71 - aarch64: Implement shift instructions
- 5820f57a - aarch64: Implement loadX instructions
- da85036e - aarch64: Clean up mergeX/splatX instructions
- 255b0ea2 - aarch64: Implement mergeX instructions
- 069cab4a - aarch64: Implement copyX/orX instructions
- 10a43495 - aarch64: Implement xorX instructions
- 5aba5c42 - aarch64: Implement absX instructions
- 93eab062 - aarch64: Implement andX instructions
- 00bf8149 - aarch64: Implement subX instructions
- 3de19796 - aarch64: Implement loadiX instructions
- cde478dc - aarch64: Implement accX instructions
- a8dfd255 - aarch64: Implement vminX/vmaxX instructions
- 7525fefd - aarch64: Implement signX instructions
- 85c14b67 - aarch64: Implement splitX/splatX instructions
- 8dc2714b - aarch64: Implement loadupdb instruction
- b74936d7 - aarch64: Implement avgX instructions
- 774f2b04 - aarch64: Implement cmpX instructions
- bfb0b9a0 - aarch64: Implement mulX instructions
- 7de2087e - aarch64: Implement div255w instruction
- c6d25ab9 - aarch64: Implement swapX instructions
- 8a0696c3 - aarch64: Implement splatw3q instruction
- d38ef503 - aarch64: Implement andn instruction
- 1828d761 - aarch64: Implement floating-point arithmetic instructions
- 588bb481 - aarch64: Implement accumulator store
- d561250b - aarch64: Implement const64 loadiq
- 65b648f0 - aarch64: Implement flags2d
- 9751e1fc - aarch64: Implement double-precision floating-point arithmetic instructions
- cb33d35b - aarch64: Implement divf instruction
- ba88b06f - aarch64: Implement sqrtf instruction
added 16 commits
- 31774aca - aarch64: Implement loadupdb instruction
- 46ae28fa - aarch32: Implement loadupdb instruction
- 651978f6 - aarch64: Implement avgX instructions
- f490214e - aarch64: Implement cmpX instructions
- 61d84d8f - aarch64: Implement mulX instructions
- 27996e36 - aarch64: Implement div255w instruction
- 97397db5 - aarch64: Implement swapX instructions
- 16449550 - aarch64: Implement splatw3q instruction
- efbd5a64 - aarch64: Implement andn instruction
- ff490170 - aarch64: Implement floating-point arithmetic instructions
- 732bb307 - aarch64: Implement accumulator store
- 76ac0041 - aarch64: Implement const64 loadiq
- eaea8117 - aarch64: Implement flags2d
- 84b6a1d7 - aarch64: Implement double-precision floating-point arithmetic instructions
- dc2d0945 - aarch64: Implement divf instruction
- b1a17ab0 - aarch64: Implement sqrtf instruction
added 15 commits
- b0e1a59a - aarch32: Implement loadupdb instruction
- b701af88 - aarch64: Implement avgX instructions
- d8e7a23b - aarch64: Implement cmpX instructions
- 78cef540 - aarch64: Implement mulX instructions
- 5c282df5 - aarch64: Implement div255w instruction
- 47c4c819 - aarch64: Implement swapX instructions
- edf6a010 - aarch64: Implement splatw3q instruction
- bc51fc64 - aarch64: Implement andn instruction
- 20a716da - aarch64: Implement floating-point arithmetic instructions
- 36229f79 - aarch64: Implement accumulator store
- d215ba96 - aarch64: Implement const64 loadiq
- 9a228d60 - aarch64: Implement flags2d
- 7be398bc - aarch64: Implement double-precision floating-point arithmetic instructions
- ed2be3a6 - aarch64: Implement divf instruction
- 63f3a5dd - aarch64: Implement sqrtf instruction