Reduce TextWord space and allocation overhead, improve TextBlock::coalesce algorithmic complexity

Currently, the per-character data of a word is allocated as a struct of arrays, i.e. each attribute (e.g. text and charcode) is allocated separately.

This causes space overhead (6 pointers, 6 malloc chunk management words (size_t/flags), alignment, ...) and runtime overhead (6 allocations and 6 frees per word).

Changing this to an array of structs reduces this overhead, sometimes significantly.
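As a rough illustration of the two layouts (not the actual TextWord code), the sketch below contrasts six parallel arrays with a single array of per-character records. Only `text` and `charcode` are named in this description; the types `CharInfo`, `WordSoA` and `WordAoS`, and the members `edge`, `charPos`, `font` and `rot`, are hypothetical placeholders.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-character record. Only `text` and `charcode` come from
// the description; the other four members are placeholders.
struct CharInfo {
    std::uint32_t text;     // Unicode code point
    std::uint32_t charcode; // font-specific character code
    double edge;            // placeholder attribute
    int charPos;            // placeholder attribute
    const void *font;       // placeholder attribute
    int rot;                // placeholder attribute
};

// Struct of arrays: every attribute lives in its own heap block, so a
// non-empty word owns six allocations.
struct WordSoA {
    std::vector<std::uint32_t> text;
    std::vector<std::uint32_t> charcode;
    std::vector<double> edge;
    std::vector<int> charPos;
    std::vector<const void *> font;
    std::vector<int> rot;

    void addChar(const CharInfo &c) {
        text.push_back(c.text);
        charcode.push_back(c.charcode);
        edge.push_back(c.edge);
        charPos.push_back(c.charPos);
        font.push_back(c.font);
        rot.push_back(c.rot);
    }
    std::size_t length() const { return text.size(); }
    static std::size_t heapBlocks() { return 6; }
};

// Array of structs: all attributes of one character sit together, so a
// non-empty word owns a single allocation.
struct WordAoS {
    std::vector<CharInfo> chars;

    void addChar(const CharInfo &c) { chars.push_back(c); }
    std::size_t length() const { return chars.size(); }
    static std::size_t heapBlocks() { return 1; }
};
```

With the second layout, the pointers and allocator bookkeeping are paid once per word instead of six times, which matters most for documents made up of many short words.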

This also improves the worst-case complexity of TextBlock::coalesce from O(N^3) to O(N^2).

This addresses two parts of #1173 (closed). With these changes, most of the time for the problematic document is spent in TextBlock::addWord, which is addressed in !1515.

Edited by StefanBruens