TODO.md

2021

  • Speed up UTF-8 decoder (using pre-computed tables/SIMD)
  • Speed up the hz_segment_t structure; try a plain buffer rather than a doubly linked list.
  • Write a proper pool allocator for internal allocations.
  • Cache calls to hz_shape
  • Add loading from UTF-16, Windows-1252 (CP1252), etc.
  • Add support for Johab, UCS-2, UCS-4, UTF-16, UTF-32, GB 18030, etc.
  • Write SSE4, AVX2 and AVX-512 optimizations for UTF-8 decoding and codepoint mapping.
  • Write a reliable, fast endianness check function; possibly look into compile-time endianness checks.
  • Write fast array unpacking functions (SIMD) that byte-swap when necessary.
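
As a starting point for the last two items, here is a minimal scalar sketch of an endianness check and an unpack routine that byte-swaps when needed. All names (`is_little_endian`, `HOST_IS_LITTLE_ENDIAN`, `bswap16`, `unpack_be16_array`) are hypothetical and not part of the library. It assumes the source data is big-endian 16-bit values, and that the `__BYTE_ORDER__` macros predefined by GCC and Clang are available for the compile-time path; otherwise it falls back to the runtime probe.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Runtime check: look at the first byte in memory of a known 16-bit value. */
static int is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Compile-time check where GCC/Clang predefined macros exist (assumption),
 * otherwise fall back to the runtime probe above. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
#  define HOST_IS_LITTLE_ENDIAN (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#else
#  define HOST_IS_LITTLE_ENDIAN (is_little_endian())
#endif

/* Swap the two bytes of a 16-bit value. */
static uint16_t bswap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Copy `count` big-endian 16-bit values from `src` into host-order `dst`.
 * The swap loop is the part a SIMD version would replace. */
static void unpack_be16_array(uint16_t *dst, const uint8_t *src, size_t count)
{
    memcpy(dst, src, count * sizeof *dst);
    if (HOST_IS_LITTLE_ENDIAN) {
        for (size_t i = 0; i < count; ++i)
            dst[i] = bswap16(dst[i]);
    }
}
```

On a big-endian host the swap loop is skipped, so the function reduces to a single `memcpy`.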

6/3/2022

  • Look for a different hash function for hz_map_t, such as MurmurHash3 or MeowHash; make the map resizable and generic, and improve its performance.
  • Lower memory footprint of hz_range_list_t and improve its initialization speed.
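
One possible direction for the hz_map_t item is sketched below: an open-addressing table with linear probing that hashes 32-bit keys with the MurmurHash3 32-bit finalizer (`fmix32`) and doubles its capacity at 75% load. This is not the existing hz_map_t implementation. The `sketch_map_*` names, the 32-bit key/value types, and the load factor are all assumptions made for illustration, and the initial capacity must be a nonzero power of two.

```c
#include <stdint.h>
#include <stdlib.h>

/* MurmurHash3 32-bit finalizer: mixes the bits of an integer key. */
static uint32_t fmix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Hypothetical types, not the library's own. */
typedef struct { uint32_t key, value; uint8_t used; } sketch_slot_t;
typedef struct { sketch_slot_t *slots; size_t cap, len; } sketch_map_t;

/* `cap` must be a nonzero power of two so that masking can replace modulo. */
static int sketch_map_init(sketch_map_t *m, size_t cap)
{
    m->slots = calloc(cap, sizeof *m->slots);
    m->cap = cap;
    m->len = 0;
    return m->slots != NULL;
}

static int sketch_map_set(sketch_map_t *m, uint32_t key, uint32_t value);

/* Double the capacity and reinsert every occupied slot. */
static int sketch_map_grow(sketch_map_t *m)
{
    sketch_map_t bigger;
    if (!sketch_map_init(&bigger, m->cap * 2))
        return 0;
    for (size_t i = 0; i < m->cap; ++i)
        if (m->slots[i].used)
            sketch_map_set(&bigger, m->slots[i].key, m->slots[i].value);
    free(m->slots);
    *m = bigger;
    return 1;
}

/* Insert or overwrite; grows first if the load would exceed 75%. */
static int sketch_map_set(sketch_map_t *m, uint32_t key, uint32_t value)
{
    if ((m->len + 1) * 4 > m->cap * 3 && !sketch_map_grow(m))
        return 0;
    size_t i = fmix32(key) & (m->cap - 1);
    while (m->slots[i].used && m->slots[i].key != key)
        i = (i + 1) & (m->cap - 1);       /* linear probing */
    if (!m->slots[i].used) {
        m->slots[i].used = 1;
        m->slots[i].key = key;
        m->len++;
    }
    m->slots[i].value = value;
    return 1;
}

/* Look up `key`; returns 1 and writes `*out` on a hit, 0 on a miss. */
static int sketch_map_get(const sketch_map_t *m, uint32_t key, uint32_t *out)
{
    size_t i = fmix32(key) & (m->cap - 1);
    while (m->slots[i].used) {
        if (m->slots[i].key == key) {
            *out = m->slots[i].value;
            return 1;
        }
        i = (i + 1) & (m->cap - 1);
    }
    return 0;
}
```

Keeping the load at or below 75% guarantees that a probe sequence always reaches an empty slot, so lookups for missing keys terminate. Making the table generic over key and value types, as the item suggests, would need either macros or a stored element size and is left out of this sketch.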