🥗 Cold-Start Recipe Recommendation via Template-Based Language Inference with LLM-Generated Summaries
Official Implementation of the paper submitted to Expert Systems with Applications (ESWA).
Recipe-sharing platforms face the Strict Cold-Start (SCS) problem: newly uploaded recipes lack user interactions (ratings, clicks), making them invisible to traditional algorithms (CF, GNNs) that rely on interaction history.
In this work, we propose a novel framework that bridges Large Language Models (LLMs) and Pre-trained Language Models (PLMs) to address this problem. Instead of relying on collaborative filtering, we reformulate recommendation as a Sentence Pair Classification (SPC) task.
By utilizing GPT-4o-mini to distill complex user and recipe attributes into coherent natural-language summaries, we enable a BERT-based model to predict user-recipe compatibility via Natural Language Inference (NLI). Our approach achieves state-of-the-art performance on the SCS-Food.com and SCS-AllRecipes.com benchmarks.
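As a rough illustration of this sentence-pair formulation (not the code shipped in this repository), the snippet below scores one LLM-generated user summary against one recipe summary with a Hugging Face BERT classifier; the summary texts, label convention, and checkpoint are placeholders.

```python
# Illustrative sketch only: scores one (user summary, recipe summary) pair
# with a BERT sentence-pair classifier. Texts and checkpoint are placeholders;
# the classification head is untrained here (finetune.py is what trains it).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # assumed binary labels: match / no match
)
model.eval()

user_summary = "This user enjoys quick vegetarian weeknight dinners ..."   # placeholder
recipe_summary = "A 20-minute chickpea curry made from pantry staples ..."  # placeholder

# BERT encodes the pair as [CLS] user summary [SEP] recipe summary [SEP]
inputs = tokenizer(user_summary, recipe_summary,
                   truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
match_prob = torch.softmax(logits, dim=-1)[0, 1].item()
print(f"compatibility score: {match_prob:.3f}")
```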
Ensure your directory is organized as follows:
coldreciperec/
├── data/                              # Main data directory
│   ├── foodcom/                       # Food.com dataset files
│   │   ├── metadata.pkl
│   │   ├── user_descriptions_RAW_ID.pkl
│   │   ├── RAW_recipes.csv            # From Kaggle
│   │   └── ...
│   └── allrecipe/                     # AllRecipes dataset files
│       ├── metadata.pkl
│       ├── raw-data_recipe.csv        # From Kaggle
│       └── ...
├── make_train_data.py                 # Generates training/validation pairs
├── make_test_data.py                  # Generates ranking test data
├── finetune.py                        # Fine-tunes the BERT model
├── evaluate.py                        # Calculates NDCG & Recall metrics
└── README.md
Before running the scripts, please download the required datasets from the sources below.
| Dataset | Source | Description |
|---|---|---|
| Summary Data (Required) | Download via Figshare | Contains the foodcom and allrecipe summary folders. |
| Food.com Raw Data | Download via Kaggle | Original recipe attributes and interaction data. |
| AllRecipes Raw Data | Download via Kaggle | Original recipe attributes and interaction data. |
After downloading the Summary Data from Figshare, extract the contents and move the foodcom and allrecipe folders into your project directory here:
coldreciperec/data/
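Optionally, a quick check like the sketch below (file names taken from the directory tree above; adjust if your layout differs) can confirm the data is in place before running the scripts.

```python
# Sanity check that the expected data files are in place.
# Paths follow the directory tree above; the "..." entries are not checked.
from pathlib import Path

required = [
    "data/foodcom/metadata.pkl",
    "data/foodcom/user_descriptions_RAW_ID.pkl",
    "data/foodcom/RAW_recipes.csv",
    "data/allrecipe/metadata.pkl",
    "data/allrecipe/raw-data_recipe.csv",
]

missing = [p for p in required if not Path(p).exists()]
if missing:
    print("Missing files:\n  " + "\n  ".join(missing))
else:
    print("All required data files found.")
```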
Create the training and validation datasets. This script handles negative sampling and template injection:

For Food.com:

python make_train_data.py --dataset foodcom

For AllRecipes:

python make_train_data.py --dataset allrecipe
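The actual templates and sampling ratio are defined in make_train_data.py; purely as an illustration of what "negative sampling and template injection" refer to, pair construction could look like the following sketch (the template wording, field names, and 1:1 negative ratio are assumptions).

```python
# Illustrative only: builds sentence pairs for SPC training.
# Positive pairs come from observed interactions; negatives are sampled
# from recipes the user did not interact with (assumed 1:1 ratio).
import random

def inject_template(user_summary: str, recipe_summary: str) -> tuple[str, str]:
    # Hypothetical templates; the real wording lives in make_train_data.py.
    premise = f"User profile: {user_summary}"
    hypothesis = f"This user would enjoy the following recipe: {recipe_summary}"
    return premise, hypothesis

def build_pairs(interactions, user_summaries, recipe_summaries, all_recipe_ids):
    pairs = []
    for user_id, pos_recipe_id in interactions:          # observed (user, recipe) pairs
        pairs.append((*inject_template(user_summaries[user_id],
                                       recipe_summaries[pos_recipe_id]), 1))
        neg_id = random.choice(all_recipe_ids)           # negative sampling
        while neg_id == pos_recipe_id:                   # a real script would also
            neg_id = random.choice(all_recipe_ids)       # exclude all of the user's positives
        pairs.append((*inject_template(user_summaries[user_id],
                                       recipe_summaries[neg_id]), 0))
    return pairs  # list of (premise, hypothesis, label)
```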
Create the inference dataset. This generates a ranking list (User x All Items) formatted into natural language prompts:

python make_test_data.py --dataset foodcom

Output: saves scp_test_data_{dataset}.pkl.
Fine-tune the PLM (default: bert-base-uncased) on the sentence pairs:

python finetune.py \
    --dataset foodcom

Outputs: models are saved to ./data/{dataset}/models/.
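To make the connection to evaluation concrete, here is a hedged sketch of how a saved checkpoint could be used to rank every candidate recipe for one user; the checkpoint path mirrors the output directory above, while the batching, label index, and data shapes are assumptions rather than what evaluate.py actually does.

```python
# Illustrative ranking pass: score every candidate recipe for one user with
# the fine-tuned checkpoint, then sort by predicted match probability.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

ckpt = "./data/foodcom/models"   # directory written by finetune.py (assumed layout)
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForSequenceClassification.from_pretrained(ckpt).eval()

def rank_recipes(user_prompt: str, recipe_prompts: dict, batch_size: int = 32):
    """Return recipe ids sorted by predicted compatibility (highest first)."""
    ids, scores = list(recipe_prompts.keys()), []
    for i in range(0, len(ids), batch_size):
        batch_ids = ids[i:i + batch_size]
        enc = tokenizer([user_prompt] * len(batch_ids),                 # premise per pair
                        [recipe_prompts[r] for r in batch_ids],         # hypothesis per pair
                        padding=True, truncation=True, max_length=512,
                        return_tensors="pt")
        with torch.no_grad():
            probs = torch.softmax(model(**enc).logits, dim=-1)[:, 1]    # assumed "match" index 1
        scores.extend(probs.tolist())
    return [r for r, _ in sorted(zip(ids, scores), key=lambda x: x[1], reverse=True)]
```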
Evaluate the model using NDCG@K and Recall@K:

python evaluate.py \
    --dataset foodcom
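For reference, NDCG@K and Recall@K over a ranked list can be computed as below; this is a self-contained sketch rather than the code in evaluate.py, and it assumes binary relevance with one ranked list of recipe ids and a set of held-out relevant ids per user.

```python
# Standalone reference implementations of the reported metrics.
import math

def recall_at_k(ranked: list, relevant: set, k: int) -> float:
    """Fraction of relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    hits = sum(1 for r in ranked[:k] if r in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked: list, relevant: set, k: int) -> float:
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(i + 2) for i, r in enumerate(ranked[:k]) if r in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Example: one user, ground truth {7, 42}, top-5 ranking below.
print(recall_at_k([3, 42, 11, 7, 9], {7, 42}, k=5))   # 1.0
print(ndcg_at_k([3, 42, 11, 7, 9], {7, 42}, k=5))     # ~0.65
```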