
Running Mask RCNN

Aditya Agarwal edited this page Jun 12, 2020 · 3 revisions

Training (PSC)

  1. Build Docker image:
docker build -t maskrcnn_benchmark .
singularity build $SCRATCH/maskrcnn_benchmark.simg  docker://thecatalyst25/maskrcnn_benchmark
  2. Start a job:
interact -p GPU-small --gpu -t 8:00:00 --ntasks-per-node=1

or

interact -p GPU-AI --gres=gpu:volta16:4 -t 8:00:00  --ntasks-per-node=4

or

interact -p GPU --gres=gpu:p100:2 -t 8:00:00  --ntasks-per-node=2
  3. Load the Singularity image:
source /etc/profile.d/modules.sh
module load singularity
export SINGULARITY_CACHEDIR=$SCRATCH/.singularity
singularity shell -B /pylon5/ir5fq3p/likhache/aditya:/data/aditya --nv  $SCRATCH/maskrcnn_benchmark_2.simg
  4. Run the code:
cd /data/aditya/fb_mask_rcnn/maskrcnn-benchmark
source runme_ycb_mask_train

or

cd /data/aditya/fb_mask_rcnn/maskrcnn-benchmark
source runme_ycb_mask_multigpu_train 2
  5. Start TensorBoard (first forward port 6006 by running ssh -L localhost:6006:localhost:6006 username@host in another terminal):
singularity shell -B /pylon5/ir5fq3p/likhache/aditya:/data/aditya --nv  $SCRATCH/maskrcnn_benchmark_2.simg
cd /pylon5/ir5fq3p/likhache/aditya/fb_mask_rcnn/maskrcnn-benchmark
python -m tensorboard.main --logdir ./logs
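
The choice between the single-GPU and multi-GPU launch in the "Run the code" step can be folded into a small helper. The sketch below is not part of the repository: the function name `train_cmd` is hypothetical, and it assumes that the argument to `runme_ycb_mask_multigpu_train` is the number of GPUs requested with `--ntasks-per-node` (the page does not state this explicitly). It only prints the command, so the command can be checked before it is run inside the container.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of maskrcnn-benchmark): print the training
# command for a given GPU count.
# Assumption: the argument of runme_ycb_mask_multigpu_train is the GPU count.

train_cmd() {
    local ngpu="${1:-1}"   # default to a single GPU

    # Reject anything that is not a positive integer.
    case "$ngpu" in
        ''|*[!0-9]*|0)
            echo "usage: train_cmd <num_gpus>" >&2
            return 1
            ;;
    esac

    if [ "$ngpu" -eq 1 ]; then
        echo "source runme_ycb_mask_train"
    else
        echo "source runme_ycb_mask_multigpu_train $ngpu"
    fi
}
```

For example, `train_cmd 2` prints `source runme_ycb_mask_multigpu_train 2`, the multi-GPU command shown above. Run the printed command from /data/aditya/fb_mask_rcnn/maskrcnn-benchmark inside the Singularity shell.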