The Yelp dataset is too large to host on GitHub.
- Download the archive from the Yelp Open Dataset page, then unzip and untar it.
- Create a `data/` folder inside your project directory.
- Place the following JSON files into `yelp-loader/data/`:
  - business.json
  - review.json
  - tip.json
  - user.json

You can do this by running the following from within the extracted `Yelp JSON` folder, but make sure your file paths are correct:

    mv yelp_academic_dataset_business.json ~/412-project/yelp-loader/data/business.json
    mv yelp_academic_dataset_user.json ~/412-project/yelp-loader/data/user.json
    mv yelp_academic_dataset_review.json ~/412-project/yelp-loader/data/review.json
    mv yelp_academic_dataset_tip.json ~/412-project/yelp-loader/data/tip.json

Make sure Python 3 and pip are installed:

    python3 --version
    pip --version
    apt install python3.12-venv

Ensure you are in the project folder (`412-project/yelp-loader`) for the command below:

    python3 -m venv venv
    source venv/bin/activate

To leave the virtual environment:

    deactivate

Only do this after the last command.

    pip install psycopg2-binary --break-system-packages
    export PATH=$PATH:/lib/postgresql/16/bin
    export PGPORT=8888
    export PGHOST=/tmp
    initdb $HOME/dbProject
    pg_ctl -D $HOME/dbProject -o '-k /tmp' start

Replace USERNAME with your system username (run `whoami` in bash to find it).

    make full
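
Because the setup depends on the four JSON files being in the right place, it can help to check them before running `make full`. The following is a minimal sketch, not part of yelp-loader itself: `check_data_dir` and `EXPECTED_FILES` are hypothetical names introduced here, and the script assumes it is run from `412-project/yelp-loader` so that `data` resolves to the folder created above. It relies on the Yelp files being newline-delimited JSON (one object per line) and only parses the first line of each file.

```python
import json
from pathlib import Path

# File names taken from the setup steps above.
EXPECTED_FILES = ("business.json", "review.json", "tip.json", "user.json")


def check_data_dir(data_dir):
    """Return a dict mapping each expected file name to a status string."""
    data_dir = Path(data_dir)
    status = {}
    for name in EXPECTED_FILES:
        path = data_dir / name
        if not path.is_file():
            status[name] = "missing"
            continue
        # Parsing only the first line is a cheap way to catch a
        # misnamed or empty file without reading the whole dataset.
        with path.open(encoding="utf-8") as handle:
            first_line = handle.readline()
        try:
            record = json.loads(first_line)
        except json.JSONDecodeError:
            status[name] = "not valid JSON on line 1"
        else:
            status[name] = "ok" if isinstance(record, dict) else "unexpected JSON type"
    return status


if __name__ == "__main__":
    # Assumption: the current directory is 412-project/yelp-loader.
    for name, state in check_data_dir("data").items():
        print(f"{name}: {state}")
```

If any file is reported as missing, re-check the destination paths used in the `mv` commands.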