Reduce TextWord space and allocation overhead, improve TextBlock::coalesce algorithmic complexity
Currently, the word characters are allocated as a struct of arrays, e.g. text and charcode are allocated separately.
This costs both space (6 pointers, 6 malloc chunk management words (size_t/flags), alignment, ...) and runtime (6 allocations and frees per word).
Changing this to an array of structs reduces this overhead, sometimes significantly.
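
To illustrate the layout change, here is a minimal sketch of the two layouts. It is not the actual TextWord code: the type names, the field names, and the choice of four fields (rather than six) are assumptions made for the example, and std::vector stands in for whatever allocation scheme the real code uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-character record; the fields are illustrative only.
struct CharInfo {
    unsigned int text;     // Unicode code point
    unsigned int charCode; // code in the font's encoding
    int charPos;           // position in the content stream
    double edge;           // glyph edge coordinate
};

// Struct of arrays: every field has its own heap buffer, so a word
// pays for one allocation (and later one free) per field.
struct WordSoA {
    std::vector<unsigned int> text;
    std::vector<unsigned int> charCode;
    std::vector<int> charPos;
    std::vector<double> edge;

    void addChar(unsigned int u, unsigned int c, int pos, double e) {
        text.push_back(u);
        charCode.push_back(c);
        charPos.push_back(pos);
        edge.push_back(e);
    }

    std::size_t length() const {
        // The parallel arrays must be kept in sync by hand.
        assert(text.size() == charCode.size());
        assert(text.size() == charPos.size());
        assert(text.size() == edge.size());
        return text.size();
    }

    static constexpr std::size_t heapBuffers() { return 4; }
};

// Array of structs: one buffer holds all fields of all characters,
// so a word pays for a single allocation regardless of field count.
struct WordAoS {
    std::vector<CharInfo> chars;

    void addChar(unsigned int u, unsigned int c, int pos, double e) {
        chars.push_back(CharInfo{u, c, pos, e});
    }

    std::size_t length() const { return chars.size(); }

    static constexpr std::size_t heapBuffers() { return 1; }
};
```

In this sketch the per-word bookkeeping shrinks from one pointer and one malloc chunk header per field to a single pointer and header in total, which is the kind of saving described above.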
Also improve the worst-case complexity of TextBlock::coalesce from O(N^3) to O(N^2).
This addresses two parts of #1173 (closed). With these changes, most of the time for the problematic document is spent in TextBlock::addWord, which is addressed in !1515.