This needs to be turned into a __main__ entry point so that the user must provide the path to the data on the command line, rather than having it hard-coded in the script.
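
A minimal sketch of what that could look like, assuming the script is Python and that the standard-library argparse module is acceptable. The argument name data_path and the function names parse_args and main are hypothetical, and the print call only stands in for the script's existing processing:

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse command-line arguments (data_path is a hypothetical name)."""
    parser = argparse.ArgumentParser(
        description="Process the data found at the given path."
    )
    # Positional argument, so the caller is required to supply it.
    parser.add_argument(
        "data_path",
        type=Path,
        help="file or directory containing the data",
    )
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    if not args.data_path.exists():
        # Fail early with a clear message instead of a traceback later on.
        raise SystemExit(f"error: data path not found: {args.data_path}")
    # Placeholder: the script's existing processing would be called here.
    print(f"Using data from {args.data_path}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Because data_path is positional, running the script without it makes argparse print a usage message and exit, which enforces the "has to provide a path" requirement. Keeping the logic in main(argv) also lets the script be imported and called from other code without touching sys.argv.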