The world's first diffusion-model-based visual language model for pathology, built on LaViDa, pretrained on the PathGen-1.6M dataset and finetuned on the PathGen-Instruct dataset.
Download the GDC client, then download the required WSIs using download_wsi_using_gdc_client.sh.
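For reference, a minimal sketch of what this step boils down to: invoking the gdc-client CLI against a GDC manifest. The manifest filename and output directory below are placeholders, not the repo's actual paths; see download_wsi_using_gdc_client.sh for the exact invocation.

```python
import subprocess

def download_wsis(manifest_path: str, out_dir: str) -> None:
    """Fetch every WSI listed in a GDC manifest into out_dir via gdc-client."""
    # `gdc-client download -m <manifest> -d <dir>` creates one sub-folder per file UUID.
    subprocess.run(
        ["gdc-client", "download", "-m", manifest_path, "-d", out_dir],
        check=True,
    )

if __name__ == "__main__":
    download_wsis("gdc_manifest.txt", "wsis/")  # placeholder paths
```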
Download PathGen-1.6M.json, which contains the WSI IDs, patch positions, and captions. Once you have the WSIs, use create_img_txt_pairs_for_pathgen.py to create the image-text pairs.
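As a rough sketch of the pairing step (the field names, patch size, and output layout below are assumptions; the authoritative logic is in create_img_txt_pairs_for_pathgen.py): each JSON entry is resolved to its WSI, the patch at the given position is cropped with OpenSlide, and the caption is written alongside it.

```python
import json
import os
import openslide

PATCH_SIZE = 512  # assumed patch size


def build_pairs(json_path: str, wsi_dir: str, out_dir: str) -> None:
    os.makedirs(out_dir, exist_ok=True)
    with open(json_path) as f:
        records = json.load(f)
    for i, rec in enumerate(records):
        # "wsi_id", "position", and "caption" are assumed field names
        wsi_path = os.path.join(wsi_dir, rec["wsi_id"] + ".svs")
        x, y = int(rec["position"][0]), int(rec["position"][1])
        slide = openslide.OpenSlide(wsi_path)
        # read_region takes level-0 coordinates and returns an RGBA PIL image
        patch = slide.read_region((x, y), 0, (PATCH_SIZE, PATCH_SIZE)).convert("RGB")
        slide.close()
        patch.save(os.path.join(out_dir, f"{i}.png"))
        with open(os.path.join(out_dir, f"{i}.txt"), "w") as f_txt:
            f_txt.write(rec["caption"])
```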
Download the VQA dataset from jamessyx/PathGen-Instruct.
You can directly download the Stage 1 and Stage 2 datasets from here, already in the format required for training.
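One way to fetch these with the Hugging Face Hub client; the repo id for the preprocessed Stage 1/Stage 2 data is the one linked above and is left as a placeholder below.

```python
from huggingface_hub import snapshot_download

# VQA source data used for Stage 2 instruction tuning
snapshot_download(
    repo_id="jamessyx/PathGen-Instruct",
    repo_type="dataset",
    local_dir="data/PathGen-Instruct",
)

# Preprocessed Stage 1 / Stage 2 training data: substitute the repo linked above
# snapshot_download(repo_id="<stage-data-repo>", repo_type="dataset", local_dir="data/stages")
```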
Download the checkpoint from https://huggingface.co/himanshunitrr/LaViDa-Pathgen. You can run inference using predict.py.
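A hedged example of pulling the checkpoint locally; how predict.py is pointed at the downloaded weights is defined in the script itself, so only the download is shown here.

```python
from huggingface_hub import snapshot_download

ckpt_dir = snapshot_download(
    repo_id="himanshunitrr/LaViDa-Pathgen",
    local_dir="checkpoints/LaViDa-Pathgen",
)
print("Checkpoint directory:", ckpt_dir)  # point predict.py at this directory
```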
git clone https://github.com/Himanshunitrr/LaViDa-PathGen.git
cd LaViDa-PathGen/LaViDa
conda create --name lavida python=3.13
conda activate lavida
pip install -e .[train]
cd eval
pip install -e .
cd ../
pip install trl==0.17.0
IMG_PATH is the path to the images; DATA_PATH is the path to the Stage 1 dataset (JSON file).
You can view the wandb.ai log for this stage at this link
Run LaViDa-PathGen/LaViDa/scripts/train/exps/cluster/pretrain_llada.sh after setting these variables.
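For example, a Python launcher equivalent to exporting the two variables and invoking the script from a shell. The dataset paths are placeholders, and this assumes the script picks IMG_PATH and DATA_PATH up from the environment; if they are defined inside the script, edit it directly instead.

```python
import os
import subprocess

env = dict(
    os.environ,
    IMG_PATH="/data/pathgen/images",        # placeholder
    DATA_PATH="/data/pathgen/stage1.json",  # placeholder
)
# Assumes the training script is launched from the LaViDa-PathGen/LaViDa directory.
subprocess.run(
    ["bash", "scripts/train/exps/cluster/pretrain_llada.sh"],
    cwd="LaViDa-PathGen/LaViDa",
    env=env,
    check=True,
)
```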
For Stage 2 finetuning, you will need mm_projector.bin, which is produced by Stage 1 training. If you just want to do Stage 2 finetuning, you can download mm_projector.bin from here.
IMG_PATH is the path to the images; DATA_PATH is the path to the Stage 2 dataset (JSON file).
You can view the wandb.ai log for this stage at this link
Run LaViDa-PathGen/LaViDa/scripts/train/exps/cluster/llada-hd-llada-s2.sh after setting these variables.
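The same launch pattern applies to Stage 2 (again with placeholder paths and the same env-var assumption); the differences are the script, the dataset, and that mm_projector.bin from Stage 1 must already be wherever the script expects it.

```python
import os
import subprocess

env = dict(
    os.environ,
    IMG_PATH="/data/pathgen-instruct/images",        # placeholder
    DATA_PATH="/data/pathgen-instruct/stage2.json",  # placeholder
)
subprocess.run(
    ["bash", "scripts/train/exps/cluster/llada-hd-llada-s2.sh"],
    cwd="LaViDa-PathGen/LaViDa",
    env=env,
    check=True,
)
```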
PathMMU
To evaluate the model on PathMMU, use main.py.
Use the conda environment you created for LLaVA when evaluating LLaVA-based models, and the conda environment you created for LaViDa when evaluating LaViDa-based models.
Also, for some reason, LLaVA-based models require an older version of LLaVA; for more information, check this issue.
- In the PathGen-LLaVA paper, the reported accuracy is quite low (~60.1), but I got different results.
A huge shoutout to @jacklishufan et al. for LaViDa and for answering all my stupid questions, to @superjamessyx et al. for PathGen and PathMMU, and to my boss Anant for all the support and guidance.


