Commit

Update description of HumanData (English and Chinese) (#356)
* Update human_data.md

* Update human_data.md

* Update human_data.md

* fix

* minor docs update English & Chinese

* minor docs update English & Chinese

---------

Co-authored-by: wei-chen-hub <[email protected]>
Wei-Chen-hub and wei-chen-hub authored Jul 3, 2023
1 parent 46dc586 commit 0e1f101
Showing 2 changed files with 112 additions and 17 deletions.
66 changes: 58 additions & 8 deletions docs/human_data.md
@@ -6,19 +6,69 @@ HumanData is a subclass of python built-in class dict, containing single-view, i

### Key/Value definition

#### The keys and values supported by HumanData are described as below.
#### Paths:

The image path is always included; paths to the segmentation map and the depth image can optionally be included if the dataset provides them.
- image_path: (N, ), list of str, each element is a relative path from the root folder (exclusive) to the image.
- segmentation (optional): (N, ), list of str, each element is a relative path from the root folder (exclusive) to the segmentation map.
- depth_path (optional): (N, ), list of str, each element is a relative path from the root folder (exclusive) to the depth image.

#### Keypoints:

The following keys should be included in `HumanData` if applicable. For each keypoints key, a corresponding mask key should be included, stating which keypoints are valid. For example, `keypoints3d_original` should correspond to `keypoints3d_original_mask`.

In `HumanData`, keypoints are stored in the `HUMAN_DATA` format, which includes 190 joints. We provide keypoint format conventions (for both 2d and 3d keypoints) for many datasets; please see [keypoints_convention](../docs/keypoints_convention.md).

- keypoints3d_smpl / keypoints3d_smplx: (N, 190, 4), numpy array, `smpl / smplx` 3d joints with confidence, joints from each dataset are mapped to HUMAN_DATA joints.
- keypoints3d_original: (N, 190, 4), numpy array, 3d joints with confidence as originally provided by the dataset, joints from each dataset are mapped to HUMAN_DATA joints.
- keypoints2d_smpl / keypoints2d_smplx: (N, 190, 3), numpy array, `smpl / smplx` 2d joints with confidence, joints from each dataset are mapped to HUMAN_DATA joints.
- keypoints2d_original: (N, 190, 3), numpy array, 2d joints with confidence as originally provided by the dataset, joints from each dataset are mapped to HUMAN_DATA joints.
- (mask sample) keypoints2d_smpl_mask: (190, ), numpy array, mask stating which keypoints are valid in `keypoints2d_smpl`. 0 means that the joint in this position cannot be found in the original dataset.
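
As a rough illustration of the layout described above, the sketch below scatters a dataset's own joints into the 190 `HUMAN_DATA` slots and builds the matching mask. The function name `to_human_data_keypoints` and the index list `src_to_dst` are hypothetical and exist only for this example; the real joint mapping comes from the keypoint conventions linked above.

```python
import numpy as np

NUM_HUMAN_DATA_JOINTS = 190  # number of joints in the HUMAN_DATA format


def to_human_data_keypoints(src_kps, src_to_dst):
    """Scatter dataset joints of shape (N, J, D) into HUMAN_DATA slots.

    src_to_dst[j] is the HUMAN_DATA index of source joint j (a hypothetical
    mapping used for illustration only). Returns keypoints of shape
    (N, 190, D + 1), whose last channel is the confidence, and a (190,) mask.
    """
    n_frames, _, dim = src_kps.shape
    keypoints = np.zeros((n_frames, NUM_HUMAN_DATA_JOINTS, dim + 1), dtype=np.float32)
    mask = np.zeros(NUM_HUMAN_DATA_JOINTS, dtype=np.uint8)
    keypoints[:, src_to_dst, :dim] = src_kps
    keypoints[:, src_to_dst, dim] = 1.0  # confidence 1 for joints the dataset provides
    mask[src_to_dst] = 1  # every other slot stays 0: joint not found in the dataset
    return keypoints, mask


src = np.random.rand(2, 3, 3).astype(np.float32)  # 2 frames, 3 joints, xyz
kps3d, kps3d_mask = to_human_data_keypoints(src, src_to_dst=[0, 5, 17])
print(kps3d.shape, kps3d_mask.shape, int(kps3d_mask.sum()))  # (2, 190, 4) (190,) 3
```

The same helper covers the 2d case: passing `(N, J, 2)` input yields an `(N, 190, 3)` array.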

#### Bounding Box:

Bounding boxes of the body (smpl), face and hands (smplx), stored as `[x_min, y_min, width, height, confidence]`. They should not exceed the image boundary.
- bbox_xywh: (N, 5), numpy array, bounding box with confidence: coordinates x, y of the bottom-left point, width w and height h of the bbox, with the score last.
- face_bbox_xywh, lhand_bbox_xywh, rhand_bbox_xywh (optional): (N, 5), numpy array, should be included if `smplx` data is provided, and are derived from smplx 2d keypoints. They have the same structure as above.
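
One possible way to derive such a box from 2d keypoints is sketched below, assuming a single instance and treating any joint with confidence above 0 as valid. The function name and the `scale` parameter are assumptions made for this example (`scale` plays the role of `bbox_body_scale` in `HumanData['misc']`); this is not the routine MMHuman3D itself uses.

```python
import numpy as np


def bbox_xywh_from_keypoints2d(kps2d, image_width, image_height, scale=1.0):
    """kps2d: (J, 3) array of x, y, confidence.

    Returns a (5,) array [x_min, y_min, width, height, confidence].
    """
    valid = kps2d[:, 2] > 0
    if not valid.any():
        return np.zeros(5, dtype=np.float32)  # confidence 0 marks "no box"
    xy = kps2d[valid, :2]
    center = (xy.min(axis=0) + xy.max(axis=0)) / 2.0
    half = (xy.max(axis=0) - xy.min(axis=0)) * scale / 2.0
    # Clip both corners so the box never exceeds the image boundary.
    lo = np.clip(center - half, 0, [image_width, image_height])
    hi = np.clip(center + half, 0, [image_width, image_height])
    return np.array([lo[0], lo[1], hi[0] - lo[0], hi[1] - lo[1], 1.0], dtype=np.float32)


kps = np.array([[10.0, 20.0, 1.0], [110.0, 220.0, 1.0], [0.0, 0.0, 0.0]])
print(bbox_xywh_from_keypoints2d(kps, image_width=200, image_height=230, scale=1.2))
```

Scaling is applied about the box centre before clipping, so a box enlarged past the image edge is cut back rather than shifted.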

#### Human Pose and Shape Parameters:

Normally saved as smpl/smplx.
- smpl: (1, ), dict, keys are `['body_pose': numpy array, (N, 23, 3), 'global_orient': numpy array, (N, 3), 'betas': numpy array, (N, 10), 'transl': numpy array, (N, 3)]`.
- smplx: (1, ), dict, keys are `['body_pose': numpy array, (N, 21, 3), 'global_orient': numpy array, (N, 3), 'betas': numpy array, (N, 10), 'transl': numpy array, (N, 3), 'left_hand_pose': numpy array, (N, 15, 3), 'right_hand_pose': numpy array, (N, 15, 3), 'expression': numpy array, (N, 10), 'leye_pose': numpy array, (N, 3), 'reye_pose': numpy array, (N, 3), 'jaw_pose': numpy array, (N, 3)]`.
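
The shapes listed above can be checked with a few lines of NumPy. The following is a minimal sketch for the `smpl` dictionary; the helper `check_param_shapes` and the table `EXPECTED_SMPL_SHAPES` are names invented for this example and are not part of `HumanData`.

```python
import numpy as np

# Trailing dimensions (everything after N) of each smpl entry, as listed above.
EXPECTED_SMPL_SHAPES = {
    'body_pose': (23, 3),
    'global_orient': (3,),
    'betas': (10,),
    'transl': (3,),
}


def check_param_shapes(params, expected):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    n = None
    for key, tail in expected.items():
        if key not in params:
            problems.append(f'missing key: {key}')
            continue
        shape = params[key].shape
        if shape[1:] != tail:
            problems.append(f'{key}: expected (N, {tail}), got {shape}')
        if n is None:
            n = shape[0]
        elif shape[0] != n:
            problems.append(f'{key}: first dimension {shape[0]} != {n}')
    return problems


N = 4  # number of instances
smpl = {
    'body_pose': np.zeros((N, 23, 3), dtype=np.float32),
    'global_orient': np.zeros((N, 3), dtype=np.float32),
    'betas': np.zeros((N, 10), dtype=np.float32),
    'transl': np.zeros((N, 3), dtype=np.float32),
}
print(check_param_shapes(smpl, EXPECTED_SMPL_SHAPES))  # []
```

An analogous table with the ten `smplx` entries would cover the second dictionary.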


#### Other keys

- config: (), str, the flag name of config for individual dataset.
- keypoints2d: (N, 190, 3), numpy array, 2d joints of the smplx model with confidence, joints from each dataset are mapped to HUMAN_DATA joints.
- keypoints3d: (N, 190, 4), numpy array, 3d joints of the smplx model with confidence. Same as above.
- smpl: (1, ), dict, keys are ['body_pose': numpy array, (N, 23, 3), 'global_orient': numpy array, (N, 3), 'betas': numpy array, (N, 10), 'transl': numpy array, (N, 3)].
- smplx: (1, ), dict, keys are ['body_pose': numpy array, (N, 21, 3),'global_orient': numpy array, (N, 3), 'betas': numpy array, (N, 10), 'transl': numpy array, (N, 3), 'left_hand_pose': numpy array, (N, 15, 3), 'right_hand_pose': numpy array, (N, 15, 3), 'expression': numpy array (N, 10), 'leye_pose': numpy array (N, 3), 'reye_pose': (N, 3), 'jaw_pose': numpy array (N, 3)].
- meta: (1, ), dict, its keys are meta data from dataset like 'gender'.
- keypoints2d_mask: (190, ), numpy array, mask for which keypoint is valid in keypoints2d. 0 means that the joint in this position cannot be found in original dataset.
- keypoints3d_mask: (190, ), numpy array, mask for which keypoint is valid in keypoints3d. 0 means that the joint in this position cannot be found in original dataset.
- misc: (1, ), dict, keys and values are defined by the user. The space misc takes (`sys.getsizeof(misc)`) shall be no more than 6MB.
- misc: (1, ), dict, keys and values are designed to describe the different settings of each dataset. They can also be defined by the user. The space misc takes (`sys.getsizeof(misc)`) shall be no more than 6MB.

#### Suggestion for WHAT to include in `HumanData['misc']`:

Miscellaneous contains the settings that differ between datasets, including camera type, source of keypoint annotations, bounding box, etc. It aims to facilitate different usages of the data.
`HumanData['misc']` is a dictionary and its keys are described as follows:
- kps3d_root_aligned: Bool, stating whether keypoints3d is root-aligned. Root alignment is not preferred for HumanData. If this key does not exist, `kps3d_root_aligned` is assumed to be `False`.
- flat_hand_mean: Bool, applicable for smplx data; for most datasets `flat_hand_mean=False`.
- bbox_source: source of the bounding box. `bbox_source='keypoints2d_smpl' or 'keypoints2d_smplx' or 'keypoints2d_original'` describes which type of keypoints is used to derive the bounding box, OR `bbox_source='provide_by_dataset'` shows that the bounding box is provided by the dataset (for example, from some detection module rather than keypoints).
- bbox_body_scale: applicable if the bounding box is derived from keypoints, stating the scale factor applied to the bounding box derived from smpl/smplx/2d_gt keypoints; we suggest `bbox_body_scale=1.2`.
- bbox_hand_scale, bbox_face_scale: applicable if the bounding box is derived from smplx keypoints, stating the scale factor applied to the bounding box derived from smplx/2d_gt keypoints; we suggest `bbox_hand_scale=1.0, bbox_face_scale=1.0`.
- smpl_source / smplx_source: describing the source of smpl/smplx annotations, one of `'original', 'nerual_annot', 'eft', 'osx_annot', 'cliff_annot'`.
- cam_param_type: describing the type of camera parameters, `cam_param_type='prespective' or 'predicted_camera' or 'eft_camera'`.
- principal_point, focal_length: (1, 2), numpy array, applicable if camera parameters are the same across the whole dataset, which is the case for some synthetic datasets.
- image_shape: (1, 2), numpy array, applicable if the image shape is the same across the whole dataset.
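
Put together, a `misc` dictionary following these suggestions might look like the sketch below. All values are made up for illustration, and the 6MB limit is interpreted here as 6 * 1024 * 1024 bytes, which is an assumption. Note that `sys.getsizeof` on a dict measures only the dict object itself, not the arrays or strings it references.

```python
import sys

import numpy as np

# Illustrative values only; they do not describe any real dataset.
misc = {
    'kps3d_root_aligned': False,
    'flat_hand_mean': False,
    'bbox_source': 'keypoints2d_smpl',
    'bbox_body_scale': 1.2,
    'smpl_source': 'original',
    'cam_param_type': 'prespective',  # spelled as in the list above
    'focal_length': np.array([[5000.0, 5000.0]]),    # (1, 2), shared by all images
    'principal_point': np.array([[112.0, 112.0]]),   # (1, 2)
    'image_shape': np.array([[224, 224]]),           # (1, 2)
}

MISC_LIMIT_BYTES = 6 * 1024 * 1024  # assumed reading of "6MB"


def misc_within_limit(misc_dict, limit=MISC_LIMIT_BYTES):
    """Apply the size rule exactly as stated: sys.getsizeof(misc) <= limit."""
    return sys.getsizeof(misc_dict) <= limit


print(misc_within_limit(misc))
```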

#### Suggestion for WHAT to include in `HumanData['meta']`:

- gender: (N, ), list of str, each element represents the gender of an smpl/smplx instance. (The key is not required if the dataset uses a gender-neutral model.)
- height (width): (N, ), list of str, each element represents the height (width) of an image. `image_shape=(width, height): (N, 2)` is not suggested, as width and height might need to be referenced in different orders. (These keys should be in `HumanData['misc']` if the image shape is the same across the dataset.)
- other keys, applicable if the value differs between instances and has some impact on keypoints or smpl/smplx (2d and 3d). For example, `focal_length` and `principal_point`, focal_length = (N, 2), principal_point = (N, 2).
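
Because every entry in `meta` is per-instance, a quick consistency check is to compare each entry's length with the number of images. The sketch below assumes made-up paths and values, and `check_meta_lengths` is a hypothetical helper, not a `HumanData` method.

```python
import numpy as np


def check_meta_lengths(meta, num_instances):
    """Return the keys whose first dimension differs from num_instances."""
    return [key for key, value in meta.items() if len(value) != num_instances]


# Made-up example data for three instances.
image_path = ['images/0001.jpg', 'images/0002.jpg', 'images/0003.jpg']
meta = {
    'gender': ['male', 'female', 'neutral'],
    'focal_length': np.full((3, 2), 1500.0),     # (N, 2), differs per image in general
    'principal_point': np.full((3, 2), 256.0),   # (N, 2)
}
print(check_meta_lengths(meta, len(image_path)))  # []
```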

#### Some other info of HumanData

- All annotations are transformed from world space to opencv camera space, for space transformation we use:

```python
from mmhuman3d.models.body_models.utils import transform_to_camera_frame, batch_transform_to_camera_frame
```
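
For plain 3d keypoints, the world-to-camera change of coordinates is the usual rigid transform `X_cam = R @ X_world + T`. The sketch below shows that step with NumPy only; it is not the MMHuman3D function imported above (whose handling of smpl/smplx parameters is more involved), and the function name and extrinsics here are assumptions for illustration.

```python
import numpy as np


def keypoints3d_world_to_camera(kps3d, rotation, translation):
    """Apply X_cam = R @ X_world + T to keypoints of shape (N, J, 4).

    rotation is (3, 3), translation is (3,). The confidence channel
    (last column) is left untouched.
    """
    out = kps3d.copy()
    out[..., :3] = kps3d[..., :3] @ rotation.T + translation
    return out


# Hypothetical extrinsics: 90 degree rotation about z, camera 5 units along z.
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
T = np.array([0.0, 0.0, 5.0])

kps_world = np.zeros((1, 190, 4))
kps_world[0, 0] = [1.0, 0.0, 0.0, 1.0]  # one valid joint at x = 1
kps_cam = keypoints3d_world_to_camera(kps_world, R, T)
print(kps_cam[0, 0])
```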

#### Key check in HumanData.

63 changes: 54 additions & 9 deletions docs_zh-CN/human_data.md
@@ -5,21 +5,66 @@
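`HumanData` is a subclass of the Python built-in dict, mainly used to store information about single-view images that contain humans. It has a generic base structure and is also compatible with customized data that has new features.
A native `HumanData` contains `numpy.ndarray` values or other Python built-in data structures, but no `torch.Tensor` data. Use `human_data.to()` to convert it to `torch.Tensor` (both CPU and GPU are supported).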

### Definition of `Key/Value`
### Definition of `Key/Value`: the keys and values supported by `HumanData` are described below.

#### The keys and values supported by `HumanData` are described below.
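#### Paths:

The image path is normally included; if the dataset provides additional depth or segmentation maps, their paths can be recorded as well.
- image_path: (N, ), list of str, each element is the path of an image relative to the root folder.
- segmantation_path (optional): (N, ), list of str, each element is the path of a segmentation map relative to the root folder.
- depth_path (optional): (N, ), list of str, each element is the path of a depth image relative to the root folder.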

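#### Keypoints:

The following keypoint keys should be included in HumanData if applicable. For every keypoints key there should be a mask stating which keypoints are valid. For example, `keypoints3d_original` should correspond to `keypoints3d_original_mask`.
Keypoints in `HumanData` are stored in the `HUMAN_DATA` format, which contains 190 keypoints. MMHuman3d provides conversions for many common keypoint formats (both 2d and 3d are supported); see [keypoints_convention](../docs_zh-CN/keypoints_convention.md).
- keypoints3d_smpl / keypoints3d_smplx: (N, 190, 4), numpy array, 3d joints of the `smpl / smplx` model with confidence, joints from each dataset are mapped to `HUMAN_DATA` joints.
- keypoints3d_original: (N, 190, 4), numpy array, 3d joints with confidence as provided by the dataset itself, joints from each dataset are mapped to `HUMAN_DATA` joints.
- keypoints2d_smpl / keypoints2d_smplx: (N, 190, 3), numpy array, 2d joints of the `smpl / smplx` model with confidence, joints from each dataset are mapped to `HUMAN_DATA` joints.
- keypoints2d_original: (N, 190, 3), numpy array, 2d joints with confidence as provided by the dataset itself, joints from each dataset are mapped to `HUMAN_DATA` joints.
- (mask sample) keypoints2d_smpl_mask: (190, ), numpy array, mask stating which keypoints are valid in `keypoints2d_smpl`. 0 means that the keypoint in this position cannot be found in the original dataset.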

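#### Bounding Box:

Bounding boxes of the body (smpl), hands and face (smplx), annotated as `[x_min, y_min, width, height, confidence]`. They should not exceed the image.
- bbox_xywh: (N, 5), numpy array, bounding box with confidence: coordinates x and y of the bottom-left point, width w and height h of the bounding box, with the confidence score last.
- config: (), str, the flag of the config for an individual dataset.
- keypoints2d: (N, 190, 3), numpy array, 2d joints of the `smplx` model with confidence, joints from each dataset are mapped to `HUMAN_DATA` joints.
- keypoints3d: (N, 190, 4), numpy array, 3d joints of the `smplx` model with confidence, joints from each dataset are mapped to `HUMAN_DATA` joints.
- face_bbox_xywh, lhand_bbox_xywh, rhand_bbox_xywh (optional): (N, 5), numpy array, these three keys should be included if the annotations contain `smplx`. They are derived from smplx 2d keypoints and have the same format as above.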

#### Human Pose and Shape Parameters:

Normally stored in smpl/smplx format.
- smpl: (1, ), dict, `keys` are ['body_pose': numpy array, (N, 23, 3), 'global_orient': numpy array, (N, 3), 'betas': numpy array, (N, 10), 'transl': numpy array, (N, 3)].
- smplx: (1, ), dict, `keys` are ['body_pose': numpy array, (N, 21, 3),'global_orient': numpy array, (N, 3), 'betas': numpy array, (N, 10), 'transl': numpy array, (N, 3), 'left_hand_pose': numpy array, (N, 15, 3), 'right_hand_pose': numpy array, (N, 15, 3), 'expression': numpy array (N, 10), 'leye_pose': numpy array (N, 3), 'reye_pose': (N, 3), 'jaw_pose': numpy array (N, 3)].
- meta: (1, ), dict, `keys` are meta data from the dataset, such as gender.
- keypoints2d_mask: (190, ), numpy array, mask stating which keypoints are valid in `keypoints2d`. 0 means that the keypoint in this position cannot be found in the original dataset.
- keypoints3d_mask: (190, ), numpy array, mask stating which keypoints are valid in `keypoints3d`. 0 means that the keypoint in this position cannot be found in the original dataset.
- misc: (1, ), dict, `keys` and `values` are defined by the user. The space `misc` takes (obtainable via `sys.getsizeof(misc)`) must not exceed 6MB.

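#### Other keys

- config: (), str, the flag of the config for an individual dataset.
- meta: (1, ), dict, `keys` are the various meta data of the dataset.
- misc: (1, ), dict, `keys` are the settings specific to each dataset, and can also be defined by the user. The space `misc` takes (obtainable via `sys.getsizeof(misc)`) must not exceed 6MB.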

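#### Suggested (possible) content of `HumanData['misc']`:
The Miscellaneous part contains the settings specific to each dataset, including camera type, source of keypoint annotations, source of bounding boxes, whether smpl/smplx annotations are included, and so on. It is meant to make reading the data easier.
`HumanData['misc']` holds a dictionary; the suggested keys are as follows:
- kps3d_root_aligned: Bool, describes whether keypoints3d has been root-aligned. Root alignment is not recommended. If this key is absent, it is assumed that no root alignment has been applied.
- flat_hand_mean: Bool, should exist for data with smplx annotations; for most datasets `flat_hand_mean=False`.
- bbox_source: describes the source of the bounding box. `bbox_source='keypoints2d_smpl' or 'keypoints2d_smplx' or 'keypoints2d_original'` describes which type of keypoints the bounding box is derived from, or `bbox_source='provide_by_dataset'` indicates that the bounding box is given directly by the dataset (for example, generated by its own detector rather than derived from keypoints).
- bbox_body_scale: should be included if the bounding box is derived from keypoints; describes the scale factor of the body bounding box derived from smpl/smplx/2d_gt keypoints. `bbox_body_scale=1.2` is suggested.
- bbox_hand_scale, bbox_face_scale: should be included if the bounding boxes are derived from keypoints; describe the scale factors of the hand and face bounding boxes derived from smpl/smplx/2d_gt keypoints. `bbox_hand_scale=1.0, bbox_face_scale=1.0` is suggested.
- smpl_source / smplx_source: describes the source of smpl/smplx, one of `'original', 'nerual_annot', 'eft', 'osx_annot', 'cliff_annot'`, stating whether smpl/smplx is provided by the dataset or comes from another annotation source.
- cam_param_type: describes the type of camera parameters, `cam_param_type='prespective' or 'predicted_camera' or 'eft_camera'`.
- principal_point, focal_length: (1, 2), numpy array, should be included if the camera parameters are constant across the dataset, which is usually the case for synthetic datasets.
- image_shape: (1, 2), numpy array, should be included if the image size is constant across the dataset.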

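#### Suggested (possible) content of `HumanData['meta']`:
- gender: (N, ), list of str, each element is the gender of the smplx model (not needed for gender-neutral models).
- height (width): (N, ), list of str, each element is the height (or width) of an image. Using `image_shape=(width, height): (N, 2)` is not recommended here, because the image shape sometimes needs to be read in the reverse order. (If the image resolution is the same across the dataset, it should be recorded in `HumanData['misc']`.)
- other identifying keys: recommended if the value differs within the dataset and affects keypoints or smpl/smplx, for example focal_length and principal_point, focal_length = (N, 2), principal_point = (N, 2).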

#### Some notes on HumanData

- All annotations have been transformed from world coordinates to opencv camera space. For the camera space transformation of smpl/smplx, use:

```from mmhuman3d.models.body_models.utils import transform_to_camera_frame, batch_transform_to_camera_frame```

#### Key check in `HumanData`.

