When I ran in verbose mode, I noticed that we're hitting the Hugging Face API every time we load the models for embedding. That seems like something we might want to download and cache rather than hitting the API every time. (Later refinement, just noting.)
Originally posted by @rlskoeser in #281 (comment)
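
One possible starting point, as a rough sketch rather than a worked-out fix: Hugging Face libraries usually do cache downloaded files locally, but may still contact the Hub on each load to check for updates, which could be what shows up in the verbose output. The snippet below checks whether a model already has a snapshot in the local hub cache and, if so, sets `HF_HUB_OFFLINE=1` so later loads skip the network. It assumes the default `huggingface_hub` cache layout (`models--<org>--<name>/snapshots/`). The helper names (`hub_cache_dir`, `is_model_cached`, `prefer_cached_model`) and the model id `some-org/some-model` are hypothetical, not taken from this codebase.

```python
import os
from pathlib import Path


def hub_cache_dir() -> Path:
    """Resolve the hub cache directory from the documented env vars."""
    if os.environ.get("HF_HUB_CACHE"):
        return Path(os.environ["HF_HUB_CACHE"])
    if os.environ.get("HF_HOME"):
        return Path(os.environ["HF_HOME"]) / "hub"
    # Default location: ~/.cache/huggingface/hub (XDG_CACHE_HOME is honored).
    xdg_cache = os.environ.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(xdg_cache) / "huggingface" / "hub"


def is_model_cached(model_id: str, cache_dir=None) -> bool:
    """Return True if at least one snapshot of model_id exists locally.

    Assumes the default cache layout: models--<org>--<name>/snapshots/<rev>/
    """
    base = Path(cache_dir) if cache_dir else hub_cache_dir()
    snapshots = base / ("models--" + model_id.replace("/", "--")) / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())


def prefer_cached_model(model_id: str, cache_dir=None) -> bool:
    """Switch the Hub client to offline mode when the model is already cached.

    Hypothetical helper. It must run before importing any Hugging Face
    library, because HF_HUB_OFFLINE is read when huggingface_hub is imported.
    """
    if is_model_cached(model_id, cache_dir):
        os.environ["HF_HUB_OFFLINE"] = "1"
        return True
    return False


if __name__ == "__main__":
    # "some-org/some-model" is a placeholder, not the model this project uses.
    if prefer_cached_model("some-org/some-model"):
        print("Model found in local cache; Hub requests disabled.")
    else:
        print("Model not cached yet; the first load will download it.")
```

The tradeoff is that offline mode also disables update checks, so a newer revision on the Hub would not be picked up until the flag is cleared. If that matters, it might be better as an opt-in setting than a default.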