
LaViDa-PathGen

The world's first diffusion-model-based visual language model for pathology. Built on LaViDa, pretrained on the PathGen-1.6M dataset and finetuned on the PathGen-Instruct dataset.

[LaViDa-PathGen animation]

Dataset:

Download the GDC client, then download the required WSIs using download_wsi_using_gdc_client.sh. Download PathGen-1.6M.json, which contains the WSI IDs, positions, and captions; once you have the WSIs, use create_img_txt_pairs_for_pathgen.py to create image-text pairs (see the sketch after these steps).
Download the VQA dataset from jamessyx/PathGen-Instruct.
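A minimal sketch of the download-and-pairing flow. The directory layout and the Python script's flag names are assumptions; check download_wsi_using_gdc_client.sh and create_img_txt_pairs_for_pathgen.py for the actual arguments.

```bash
# Fetch the WSIs from GDC (the repo script wraps the gdc-client CLI).
bash download_wsi_using_gdc_client.sh

# Build image-text pairs from the downloaded WSIs and the PathGen-1.6M captions.
# NOTE: the flag names below are assumptions; check the script's argument parser.
python create_img_txt_pairs_for_pathgen.py \
    --wsi_dir ./wsi \
    --json_path ./PathGen-1.6M.json \
    --output_dir ./pathgen_pairs
```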

You can directly download the Stage 1 and Stage 2 datasets from here, already in the format required for training.

Transformers-compatible weights (HF)

Inference

Download the checkpoint from https://huggingface.co/himanshunitrr/LaViDa-Pathgen. You can run inference using predict.py.
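A hypothetical invocation of predict.py; the flag names below are assumptions, so check the script for its actual CLI.

```bash
# Hypothetical flags; predict.py's real interface may differ.
python predict.py \
    --checkpoint himanshunitrr/LaViDa-Pathgen \
    --image sample_patch.png \
    --prompt "Describe this pathology patch."
```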

LaViDa Setup:

```bash
# Clone the repo and enter the bundled LaViDa directory.
git clone https://github.com/Himanshunitrr/LaViDa-PathGen.git
cd LaViDa-PathGen/LaViDa

# Create the environment and install LaViDa with training extras
# (quoted so shells like zsh don't expand the brackets).
conda create --name lavida python=3.13
conda activate lavida
pip install -e ".[train]"

# Install the evaluation package, then trl.
cd eval
pip install -e .
cd ../
pip install trl==0.17.0
```
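As a quick sanity check that the install worked (a generic PyTorch check, not part of the repo's own instructions):

```bash
# Confirm PyTorch is importable and can see a GPU.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```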

Training

Stage 1 Pretraining

IMG_PATH is the path to the images; DATA_PATH is the path to the Stage 1 dataset (JSON file).

You can view the wandb.ai log for this stage at this link.


Training script: LaViDa/scripts/train/exps/cluster/pretrain_llada.sh
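A sketch of launching Stage 1. Whether the script reads IMG_PATH and DATA_PATH from the environment or expects them edited in place is an assumption; check the script header.

```bash
# Placeholder paths; point these at your image folder and Stage 1 JSON.
export IMG_PATH=/data/pathgen/images
export DATA_PATH=/data/pathgen/stage1.json
bash LaViDa/scripts/train/exps/cluster/pretrain_llada.sh
```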

Stage 2 Finetuning

For Stage 2 finetuning you will need mm_projector.bin, which is produced by Stage 1 training. If you only want to do Stage 2 finetuning, you can download mm_projector.bin from here.

IMG_PATH is the path to the images; DATA_PATH is the path to the Stage 2 dataset (JSON file).

You can view the wandb.ai log for this stage at this link.

Finetuning script: LaViDa/scripts/train/exps/cluster/llada-hd-llada-s2.sh
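A sketch of launching Stage 2, under the same assumption about how the script consumes the paths; where the script expects mm_projector.bin is also an assumption, so check the script before running.

```bash
# Placeholder paths; point these at your image folder and Stage 2 JSON,
# and make sure mm_projector.bin from Stage 1 is where the script expects it.
export IMG_PATH=/data/pathgen/images
export DATA_PATH=/data/pathgen/stage2.json
bash LaViDa/scripts/train/exps/cluster/llada-hd-llada-s2.sh
```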

Evaluation

PathMMU

To evaluate the model on PathMMU, use main.py.

Use the conda environment you created for LLaVA when evaluating LLaVA-based models, and the conda environment you created for LaViDa when evaluating LaViDa-based models.

Also, for some reason LLaVA-based models need an old version of LLaVA; for more information, check this issue.
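A hypothetical evaluation run for a LaViDa-based model; main.py's flags are assumptions, so check its argument parser.

```bash
conda activate lavida
# Hypothetical flags; check main.py for the real interface.
python main.py --model_path himanshunitrr/LaViDa-Pathgen --dataset PathMMU
```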


• In the PathGen-LLaVA paper the reported accuracy is quite low (~60.1), but I got different results.

Thanks

A huge shoutout to @jacklishufan et al. for LaViDa and for answering all my stupid questions, to @superjamessyx et al. for PathGen and PathMMU, and to my boss Anant for all the support and guidance.
