I'm pretty sure the difference between Python 2 and 3 is because newer versions ship newer Unicode tables. For example, just running Python 2.7 updates the table length from 5143 to 5516 entries (UCD 5.1.0). With Python 3.6, the table has 5722 entries (UCD 9). Python 3.7 has UCD 11, so there would probably be even more entries, but I did not try it out.
Should poppler/UnicodeDecompTables.h be updated to take advantage of the newer tables?
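
For anyone who wants to compare interpreters, here is a rough Python 3 sketch that prints the UCD version bundled with the running interpreter and counts the code points that have a decomposition mapping. The helper name `count_decompositions` is made up for illustration, and this is only my assumption about what is being counted, not necessarily how the poppler table is actually generated, so the totals may not line up with the numbers above.

```python
import sys
import unicodedata


def count_decompositions():
    """Count code points whose UCD entry has a non-empty decomposition mapping.

    Hypothetical helper: it counts canonical and compatibility mappings alike,
    which may differ from what the poppler table actually stores.
    """
    count = 0
    for cp in range(sys.maxunicode + 1):
        # decomposition() returns '' when the character has no mapping
        if unicodedata.decomposition(chr(cp)):
            count += 1
    return count


if __name__ == "__main__":
    # unidata_version is the UCD version this interpreter was built with
    print("UCD version:", unicodedata.unidata_version)
    print("code points with a decomposition:", count_decompositions())
```

Running this under each interpreter in turn should show whether the entry count tracks the bundled UCD version.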