Score user pose vs reference

- [ ] `compare.py` calculates the difference between reference image and the provided image
- [ ] Accept external threshold dict
- [ ] Generate "score" - need to define that
- [ ] Wire in the `--eval` argument
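
One possible shape for the tasks above, as a rough sketch rather than a settled design. It assumes keypoints have already been extracted from both images into a `{joint: (x, y)}` dict with normalised coordinates. The names `joint_distances`, `score_pose`, `build_parser` and `DEFAULT_THRESHOLD` are hypothetical, and so is the scoring rule: the "score" is still undefined, so this just averages a per-joint value that falls linearly from 1 to 0 as the distance approaches that joint's threshold. Treating `--eval` as a boolean flag is also an assumption.

```python
import argparse
import math

# Assumed fallback distance limit for joints missing from the threshold dict.
DEFAULT_THRESHOLD = 0.1


def joint_distances(reference, user):
    """Euclidean distance for every joint present in both poses."""
    return {
        joint: math.dist(reference[joint], user[joint])
        for joint in reference
        if joint in user
    }


def score_pose(reference, user, thresholds=None):
    """Return a value in [0, 1]; 1.0 means every shared joint matches exactly.

    `thresholds` is the external dict: joint name -> largest distance that
    still earns any credit. This scoring rule is a placeholder.
    """
    thresholds = thresholds or {}
    distances = joint_distances(reference, user)
    if not distances:
        return 0.0  # no joints in common, nothing to compare

    per_joint = []
    for joint, distance in distances.items():
        limit = thresholds.get(joint, DEFAULT_THRESHOLD)
        if limit <= 0:
            # A non-positive limit only accepts an exact match.
            per_joint.append(1.0 if distance == 0 else 0.0)
        else:
            per_joint.append(max(0.0, 1.0 - distance / limit))
    return sum(per_joint) / len(per_joint)


def build_parser():
    """Argument parser with the --eval flag (assumed to be a simple switch)."""
    parser = argparse.ArgumentParser(
        description="Compare a user pose against a reference pose."
    )
    parser.add_argument(
        "--eval",
        action="store_true",
        help="compute and print the score",
    )
    return parser
```

With this layout, `build_parser().parse_args()` exposes the flag as `args.eval`, and `compare.py` could call `score_pose(reference, user, thresholds)` only when it is set. Averaging per-joint values keeps the score independent of how many joints were detected, but a weighted sum or a pass/fail count per joint would fit the same interface.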