LLaVA-3D-Instruct-860K.json question: IndexError when training on official dataset: index out of bounds for prompt_features #41

**Training with the official LLaVA-3D-Instruct-860K.json dataset fails with IndexError: index 0 is out of bounds for dimension 0 with size 0.**

This is the error message:
```
Traceback (most recent call last):
  File "/mnt/sda/shenhao/conda/envs/llava-3d/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/mnt/sda/shenhao/conda/envs/llava-3d/lib/python3.10/site-packages/accelerate/accelerator.py", line 924, in accumulate
    yield
  File "/mnt/sda/shenhao/conda/envs/llava-3d/lib/python3.10/site-packages/transformers/trainer.py", line 1869, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/mnt/sda/shenhao/conda/envs/llava-3d/lib/python3.10/site-packages/transformers/trainer.py", line 2772, in training_step
    loss = self.compute_loss(model, inputs)
  File "/mnt/sda/shenhao/conda/envs/llava-3d/lib/python3.10/site-packages/transformers/trainer.py", line 2795, in compute_loss
    outputs = model(**inputs)
  File "/mnt/sda/shenhao/conda/envs/llava-3d/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/mnt/sda/shenhao/conda/envs/llava-3d/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/sda/shenhao/conda/envs/llava-3d/lib/python3.10/site-packages/accelerate/utils/operations.py", line 581, in forward
    return model_forward(*args, **kwargs)
  File "/mnt/sda/shenhao/conda/envs/llava-3d/lib/python3.10/site-packages/accelerate/utils/operations.py", line 569, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "/mnt/sda/shenhao/conda/envs/llava-3d/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
  File "/mnt/sda/shenhao/code/LLaVA-3D/llava/model/language_model/llava_llama.py", line 86, in forward
    ) = self.prepare_inputs_labels_for_multimodal(
  File "/mnt/sda/shenhao/code/LLaVA-3D/llava/model/llava_arch.py", line 332, in prepare_inputs_labels_for_multimodal
    cur_prompt_features = prompt_features[cur_prompt_idx].unsqueeze(0)  # (1, C)
IndexError: index 0 is out of bounds for dimension 0 with size 0
```

I think the cause is a mismatch between the data format and the code's assumptions: when a conversation contains `<boxes>` tokens but the sample has no corresponding 3D box data, the code indexes into an empty `prompt_features` tensor, which raises the out-of-bounds error.


  **Execution Flow**
  1. DataLoader: if a sample lacks `target.boxes` data (because of a wrong field name or an empty list), `clicks = []`.
  2. Collate: `prompt_features = encode_prompts(clicks)` has shape `(0, 3)`, i.e. an empty tensor.
  3. Tokenizer: `<boxes>` in the conversation is converted to `LOC_TOKEN_INDEX` tokens.
  4. Model: `num_prompts > 0`, so the loop accesses `prompt_features[0]`, `[1]`, ... but the tensor is empty.
  5. Error: `IndexError: index 0 is out of bounds for dimension 0 with size 0`.
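
To make the flow above concrete, here is a minimal sketch that mimics it with plain Python lists instead of tensors. Everything in it is an assumption for illustration: `encode_prompts` here is a stand-in, not the repository's function, and `lookup_unchecked` / `lookup_checked` are hypothetical names. It only shows how one `<boxes>` placeholder combined with zero boxes leads to an index error, and how a count check could turn that into a clearer message.

```python
LOC_TOKEN = "<boxes>"  # placeholder text that appears in the dataset answers


def encode_prompts(clicks):
    # Hypothetical stand-in: one feature row per box (the real code returns a tensor).
    return [tuple(box[:3]) for box in clicks]


def lookup_unchecked(answer, clicks):
    # Mirrors the failing pattern: index one feature row per placeholder.
    prompt_features = encode_prompts(clicks)
    num_prompts = answer.count(LOC_TOKEN)
    return [prompt_features[i] for i in range(num_prompts)]


def lookup_checked(answer, clicks):
    # Same lookup, but compare the counts first and fail with a clear message.
    prompt_features = encode_prompts(clicks)
    num_prompts = answer.count(LOC_TOKEN)
    if num_prompts > len(prompt_features):
        raise ValueError(
            f"{num_prompts} {LOC_TOKEN} placeholder(s) but only "
            f"{len(prompt_features)} prompt feature row(s)"
        )
    return [prompt_features[i] for i in range(num_prompts)]
```

Calling `lookup_unchecked("Answer: <boxes>.", [])` raises an `IndexError`, which matches the shape of the failure in the traceback; `lookup_checked` raises a `ValueError` that names both counts instead.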

Here are two samples from the dataset. The first has an empty `box` list while its answer still contains `<boxes>`; the second has one box:
```
{
        "id": 838720,
        "video": "scannet/scene0069_00",
        "conversations": [
            {
                "value": "<video>\nIdentify the object according to the following description.\nThe door's neighbor is a long radiator.\nThere may be no corresponding object, or there may be one or more objects.",
                "from": "human"
            },
            {
                "value": "Answer: <boxes>.",
                "from": "gpt"
            }
        ],
        "box": [],
        "metadata": {
            "dataset": "multi3drefer",
            "question_type": "zt_wo_d",
            "ann_id": 8,
            "object_id": []
        }
    },
{
        "id": 858841,
        "video": "scannet/scene0502_00",
        "conversations": [
            {
                "value": "<video>\nIdentify the object according to the following description.\nA whiteboard-fronting office chair sits in the room.\nThere may be no corresponding object, or there may be one or more objects.",
                "from": "human"
            },
            {
                "value": "Answer: <boxes>.",
                "from": "gpt"
            }
        ],
        "box": [
            [
                -0.3842504620552063,
                0.2782879201695323,
                0.6257577687501907,
                0.630746603012085,
                0.6165543142706156,
                0.6325612962245941
            ]
        ],
        "metadata": {
            "dataset": "multi3drefer",
            "question_type": "st_w_d",
            "ann_id": 56,
            "object_id": [
                1
            ]
        }
    }
```
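
To see how many samples are affected, something like the following could be used to scan the annotation file. This is only a sketch under assumptions: it assumes the file is a JSON array of objects with the `id`, `conversations`, and `box` fields shown above, and the helper names (`count_box_placeholders`, `find_mismatched_samples`, `load_and_check`) are made up for this example.

```python
import json


def count_box_placeholders(sample, token="<boxes>"):
    # Count placeholder occurrences across all conversation turns of one sample.
    return sum(
        turn.get("value", "").count(token)
        for turn in sample.get("conversations", [])
    )


def find_mismatched_samples(samples):
    # Return ids of samples that mention <boxes> but carry no box data.
    return [
        sample.get("id")
        for sample in samples
        if count_box_placeholders(sample) > 0 and not sample.get("box")
    ]


def load_and_check(path):
    # Assumes the annotation file is a single JSON array of sample objects.
    with open(path, encoding="utf-8") as f:
        return find_mismatched_samples(json.load(f))
```

Running `load_and_check` on the local copy of the annotation file would list the ids that combine a `<boxes>` placeholder with an empty or missing `box` field. Whether such samples should be filtered out or handled in the model code is a question for the maintainers.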
