bytereader: SIMD-based optimization to find start code on H.264 and H.265 streams
Submitted by Sungho Bae
Created attachment 277253
During the parse phase for video streams, byte scanning is performed to find the start and end of each NAL unit. This patch speeds up the byte scan for the start code by using ARM NEON instructions and by using pointer access instead of indexed access.
The benefits are lower CPU usage and faster scanning.
This patch assumes that two consecutive zero bytes ('0x0000') rarely appear in the stream except as part of a start code, that is, the byte sequence 00 00 01 ('0x010000' when read as a little-endian value).
The proposed optimization builds on this assumption.
If we can quickly determine that the scanning area contains no '0x0000', we can skip the start code scan for that area entirely.
We therefore implemented a preprocessing step that checks whether '0x0000' is present.
Because zeros rarely appear, the scan can be skipped most of the time and performance improves dramatically.
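The prefilter idea can be illustrated with a minimal portable C sketch. This is not the code from the attached patch: it replaces the NEON comparison with a scalar 8-byte zero-byte test so that it runs on any platform, and it skips a block when it contains no zero byte at all, which is a slightly looser test than looking for '0x0000'. The names has_zero_byte and find_start_code are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns nonzero if any of the 8 bytes in v is 0x00 (well-known bit trick). */
static int has_zero_byte(uint64_t v)
{
    return ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL) != 0;
}

/* Hypothetical helper: returns the offset of the first 00 00 01 sequence
 * in data[0..size), or -1 if there is none. */
static ptrdiff_t find_start_code(const uint8_t *data, size_t size)
{
    const uint8_t *p = data;
    const uint8_t *end = data + size;

    while ((size_t)(end - p) >= 3) {
        size_t remaining = (size_t)(end - p);
        size_t n = remaining < 8 ? remaining : 8;

        if (remaining >= 8) {
            uint64_t word;
            memcpy(&word, p, sizeof word);   /* safe unaligned load */
            if (!has_zero_byte(word)) {
                /* Fast path: no start code can begin in this block. */
                p += 8;
                continue;
            }
        }

        /* Slow path: test every candidate position that starts in this
         * block. A start code may extend into the next block, so only the
         * end of the whole buffer bounds the comparison. */
        for (size_t i = 0; i < n && i + 3 <= remaining; i++) {
            if (p[i] == 0 && p[i + 1] == 0 && p[i + 2] == 1)
                return (p + i) - data;
        }
        p += n;
    }
    return -1;
}
```

A block is skipped only when none of its bytes is zero, so a start code that begins in one block and ends in the next is still found, because the block holding its first zero byte always takes the slow path. A NEON version would apply the same structure to 16-byte vectors.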
Patch 277253, "001_gstbytereader_neon_improvement_patch":