Mismatch performance on CelebAMask-HQ test set #37

@ChaoLi977

Hello, thank you for your great work.

However, when I ran the face parsing model on the CelebAMask-HQ test set (2,824 images), I got the performance below.

The model I ran is farl/celebm/448 (face_parsing.farl.celebm.main_ema_181500_jit.pt).
F1 scores: {'background': 0.9343307778499743, 'skin': 0.9641438432481969, 'nose': 0.9377685027511485, 'eye_g': 0.8991579940116652, 'l_eye': 0.8797685119013225, 'r_eye': 0.8815088490017493, 'l_brow': 0.8546936399701022, 'r_brow': 0.8517906024905171, 'l_ear': 0.8826971414311515, 'r_ear': 0.8796045818209585, 'mouth': 0.9227481788076385, 'u_lip': 0.8879356316268103, 'l_lip': 0.9040920760745508, 'hair': 0.935249390735524, 'hat': 0.8693470068443545, 'ear_r': 0.697250254530866, 'neck_l': 0.3732396631852335, 'neck': 0.8658552106253891, 'cloth': 0.8273804800814614, 'fg_mean': 0.8507906421743688}

It seems the mean F1 is 85.07, which does not match the 89.56 reported in the paper.

Moreover, the necklace (neck_l) performance is very low, 37.32, which is much lower than the 69.72 in the paper.

Could you help me figure out the reason?
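In case the discrepancy comes from the metric itself, here is a minimal sketch of how I compute per-class F1 over integer label maps (the function name and details are illustrative, not necessarily the paper's evaluation code):

```python
import numpy as np

def per_class_f1(pred, gt, num_classes):
    """Per-class F1 between integer label maps of matching shape.

    pred, gt: arrays of class indices (e.g. (N, H, W)).
    Classes absent from both pred and gt get F1 = 0 here.
    """
    scores = []
    for c in range(num_classes):
        p = (pred == c)
        g = (gt == c)
        tp = np.logical_and(p, g).sum()  # true positives for class c
        denom = p.sum() + g.sum()        # = 2*TP + FP + FN
        scores.append(2.0 * tp / denom if denom > 0 else 0.0)
    return scores
```

If your evaluation differs (e.g. image-wise averaging instead of pooling all pixels, or a different handling of classes missing from an image), that alone could shift the mean F1 by a few points.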
