The flexpolyline crate (https://docs.rs/flexpolyline/0.1.0/flexpolyline/) runs the bench_encode benchmark in around 19.33 µs on my machine, vs 77 µs for our decode function. That's roughly a 4x difference. They seem to use a lot of lookups, which I imagine accounts for the gap, but it would be good to investigate.
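
To make the "lookups" idea concrete, here is a minimal sketch of a table-based character decode next to a search-based one. It is not taken from either crate: the names `ALPHABET`, `DECODE_TABLE`, `decode_char_lookup` and `decode_char_search` are hypothetical, and the 64-character URL-safe alphabet is an assumption about the encoding. Whether this is actually where the time goes would still need profiling.

```rust
/// URL-safe 64-character alphabet (assumed for illustration only).
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Builds a 256-entry reverse table at compile time:
/// input byte -> 6-bit value, or -1 for bytes outside the alphabet.
const fn build_decode_table() -> [i8; 256] {
    let mut table = [-1i8; 256];
    let mut i = 0;
    while i < 64 {
        table[ALPHABET[i] as usize] = i as i8;
        i += 1;
    }
    table
}

/// Hypothetical precomputed table; evaluated once by the compiler.
const DECODE_TABLE: [i8; 256] = build_decode_table();

/// Lookup-based decode: a single array index per input byte.
fn decode_char_lookup(byte: u8) -> Option<u8> {
    let value = DECODE_TABLE[byte as usize];
    if value < 0 {
        None
    } else {
        Some(value as u8)
    }
}

/// Search-based decode, for comparison: scans the alphabet linearly.
fn decode_char_search(byte: u8) -> Option<u8> {
    ALPHABET.iter().position(|&c| c == byte).map(|i| i as u8)
}
```

Both functions return the same result for every byte, so swapping one for the other in a benchmark would isolate the cost of the per-character decode step.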