This module was used to extract scene, face and subtitle (dialogue) features from the MovieGraphs dataset.
Note: we have already provided all the extracted features (refer to ../README.md), so you need not extract them again.
Make sure your working directory is the project root and not the feature_extractors module.
Download the required checkpoint from here and save it in the directory mentioned at config.yaml:saved_model_path.
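The exact key layout of config.yaml is repository-specific; as a rough sanity check, something along these lines can confirm the checkpoint landed where the extractors expect it (PyYAML and a top-level saved_model_path key are assumptions here):

```python
# Hypothetical sanity check: list what sits in the configured checkpoint directory.
import pathlib
import yaml  # PyYAML

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

ckpt_dir = pathlib.Path(cfg["saved_model_path"])  # key name taken from this README
print("Checkpoint directory:", ckpt_dir.resolve())
for ckpt in sorted(ckpt_dir.iterdir()):
    print(" ", ckpt.name)
```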
This module is separate from EmoTx.
For feature extraction, we require python>=3.8. Therefore, you will have to create another environment for this.
Using Conda-
$ conda create -n mg_ft_extr python=3.8
$ conda activate mg_ft_extr
(mg_ft_extr) $ pip install -r feature_extractors/feat_extraction_requirements_py38.txt
Or using pip (make sure you have python3.8 installed)-
$ python3.8 -m pip install virtualenv
$ python3.8 -m virtualenv mg_ft_extr
$ source mg_ft_extr/bin/activate
(mg_ft_extr) $ pip install -r feature_extractors/feat_extraction_requirements_py38.txt
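A quick way to confirm the new environment is active and meets the version requirement (the torch import below is an assumption about what the requirements file installs, not a definitive list):

```python
# Minimal environment check for the feature-extraction environment.
import sys

assert sys.version_info >= (3, 8), "Feature extraction requires Python >= 3.8"
print("Python:", sys.version.split()[0])

# Assumed to be pinned by feat_extraction_requirements_py38.txt; adjust as needed.
import torch
print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
```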
- Extract features from MViT_V1 pre-trained on Kinetics400 dataset
(mg_ft_extr) $ python -m feature_extractors.action_feat_extractor scene_feat_type="mvit_v1"
- Extract features from ResNet50 pre-trained on Places365 dataset
(mg_ft_extr) $ python -m feature_extractors.scene_feat_extractor scene_feat_type="resnet50_places"
- Extract features from ResNet152 pre-trained on ImageNet dataset
(mg_ft_extr) $ python -m feature_extractors.scene_feat_extractor scene_feat_type="generic"
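For intuition, the "generic" scene features correspond to pooled activations of an ImageNet-pre-trained backbone. The sketch below shows the general idea with torchvision's ResNet152 on a single frame; the repository's scene_feat_extractor handles frame sampling, batching and saving, so treat this only as an illustration (the frame path and the standard ImageNet preprocessing values are assumptions, not necessarily what the repo uses):

```python
# Illustrative only: pooled ResNet152 (ImageNet) features for one frame.
import torch
from PIL import Image
from torchvision import models, transforms

model = models.resnet152(pretrained=True).eval()
# Drop the classification head so the forward pass ends at the pooled features.
backbone = torch.nn.Sequential(*list(model.children())[:-1])

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

frame = Image.open("frame_000001.jpg").convert("RGB")  # hypothetical frame path
with torch.no_grad():
    feat = backbone(preprocess(frame).unsqueeze(0)).flatten(1)  # shape: (1, 2048)
print(feat.shape)
```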
If you wish to extract the dialogue features from a fine-tuned RoBERTa (fine-tuned using ../finetune_roberta.py), make sure the fine-tuned model with the appropriate name is saved in the path mentioned in config.yaml:saved_model_path.
The command line argument srt_feat_pretrained=True implies we will use the pre-trained RoBERTa checkpoint, whereas srt_feat_pretrained=False implies we will use the fine-tuned RoBERTa checkpoint.
- Extract utterance level features from the fine-tuned RoBERTa model-
(mg_ft_extr) $ python -m feature_extractors.srt_features_extractor srt_feat_type="independent" srt_feat_pretrained=False
- Extract concatenated utterance features from the fine-tuned RoBERTa model-
(mg_ft_extr) $ python -m feature_extractors.srt_features_extractor srt_feat_type="concat" srt_feat_pretrained=False
- Extract utterance level features from pre-trained RoBERTa-base checkpoint-
(mg_ft_extr) $ python -m feature_extractors.srt_features_extractor srt_feat_type="independent" srt_feat_pretrained=True
- Extract concatenated utterance features from pre-trained RoBERTa-base checkpoint-
(mg_ft_extr) $ python -m feature_extractors.srt_features_extractor srt_feat_type="concat" srt_feat_pretrained=True
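The two srt_feat_type modes differ in what is fed to the model: "independent" encodes each utterance on its own, while "concat" encodes the dialogue as one concatenated sequence. Here is a rough sketch of the idea with Hugging Face transformers; the pooling choice (using the first-token embedding) and the example utterances are assumptions, and the repository's srt_features_extractor defines the actual behavior:

```python
# Illustrative sketch of utterance features from the pre-trained RoBERTa-base checkpoint.
import torch
from transformers import RobertaModel, RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base").eval()

utterances = ["Where have you been?", "I got held up at work."]  # hypothetical dialogue

with torch.no_grad():
    # "independent": one feature vector per utterance.
    enc = tokenizer(utterances, padding=True, truncation=True, return_tensors="pt")
    per_utt = model(**enc).last_hidden_state[:, 0]      # (num_utterances, 768)

    # "concat": all utterances joined into a single sequence.
    enc = tokenizer(" ".join(utterances), truncation=True, return_tensors="pt")
    concat = model(**enc).last_hidden_state[:, 0]       # (1, 768)

print(per_utt.shape, concat.shape)
```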
We use MTCNN for face detection and a CascadeRCNN pre-trained with MovieNet annotations for person detection. We first detect the person bounding boxes and then detect faces within each person box, which minimizes false-positive face detections.
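To make the two-stage idea concrete, below is a rough sketch of running MTCNN inside previously detected person boxes and mapping the face boxes back to frame coordinates. The person boxes are assumed to come from the CascadeRCNN detector mentioned above; loading that detector (and the frame path used here) is outside the scope of this sketch.

```python
# Illustrative sketch: detect faces only inside person boxes with MTCNN.
from facenet_pytorch import MTCNN
from PIL import Image

mtcnn = MTCNN(keep_all=True)
frame = Image.open("frame_000001.jpg").convert("RGB")   # hypothetical frame path
person_boxes = [(120, 40, 380, 700)]                    # assumed CascadeRCNN output (x1, y1, x2, y2)

face_boxes = []
for (x1, y1, x2, y2) in person_boxes:
    crop = frame.crop((x1, y1, x2, y2))
    boxes, probs = mtcnn.detect(crop)                   # boxes are relative to the crop
    if boxes is None:
        continue
    for (fx1, fy1, fx2, fy2), p in zip(boxes, probs):
        # Shift back to full-frame coordinates.
        face_boxes.append((fx1 + x1, fy1 + y1, fx2 + x1, fy2 + y1, float(p)))

print(face_boxes)
```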
- Edit config.yaml:char_detection.save_path to set where the character detections will be saved.
- Perform character detection-
(mg_ft_extr) $ python -m feature_extractors.character_detector
- Once the detection is over, proceed to character tracking
(mg_ft_extr) $ python -m feature_extractors.character_tracker
Note: This will generate a directory named character_tracks/ in the config.yaml:save_path. Move it inside the data/ directory that is mentioned at config.yaml:data_path.
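If helpful, the move can also be scripted from the same config file (assuming save_path and data_path are top-level keys in config.yaml, which may not match the actual nesting):

```python
# Hypothetical helper: move character_tracks/ from save_path to data_path.
import pathlib
import shutil
import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

src = pathlib.Path(cfg["save_path"]) / "character_tracks"   # key names from this README
dst = pathlib.Path(cfg["data_path"]) / "character_tracks"
shutil.move(str(src), str(dst))
print("Moved tracks to", dst)
```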
Make sure you have performed character detection and tracking, and moved the character_tracks/ directory into the data/ directory, before extracting the face features below.
- Extract face features from InceptionResNet_v1 pre-trained on VGG-Face2 dataset-
(mg_ft_extr) $ python -m feature_extractors.face_feat_extractor face_feat_type="generic"
- Extract face features from ResNet50 pre-trained on SFEW, FER13 and VGG-Face datasets
(mg_ft_extr) $ python -m feature_extractors.face_feat_extractor face_feat_type="resnet50_fer"
- Extract face features from VGG-m pre-trained on FER13 and VGG-Face datasets
(mg_ft_extr) $ python -m feature_extractors.face_feat_extractor face_feat_type="emo"
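As a point of reference for the "generic" face features, facenet_pytorch exposes the VGGFace2-pre-trained InceptionResnetV1 directly. The sketch below embeds a single detected face; the frame path is hypothetical, and in the actual pipeline the face crops come from the detection and tracking stages rather than a fresh MTCNN pass:

```python
# Illustrative sketch: 512-d face embedding from InceptionResnetV1 (VGGFace2).
import torch
from facenet_pytorch import InceptionResnetV1, MTCNN
from PIL import Image

mtcnn = MTCNN(image_size=160)                       # returns an aligned 160x160 face tensor
resnet = InceptionResnetV1(pretrained="vggface2").eval()

frame = Image.open("frame_000001.jpg").convert("RGB")   # hypothetical frame path
face = mtcnn(frame)                                 # None if no face is found
if face is not None:
    with torch.no_grad():
        embedding = resnet(face.unsqueeze(0))       # shape: (1, 512)
    print(embedding.shape)
```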