Misalignment between the point cloud visualization and the bounding box returned by the LLM

Dear author,

Thanks for sharing the excellent work.

While running `llava/eval/run_llava_3d.py` on the scene `./demo/scannet/scene0356_00` to get the bounding boxes of the bed and the window, I wanted to visualize the predictions in the scene. I used the `images_tensor`, `depths_tensor`, `poses_tensor`, and `intrinsics_tensor` sequences to build a point cloud, and then drew the bounding boxes predicted by the LLM on top of it.
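A minimal sketch of this kind of backprojection for a single frame is given here for reference. It assumes depth in metres, a 3x3 pinhole intrinsic matrix, and a 4x4 camera-to-world pose; none of these conventions is confirmed by the repository, and `backproject_depth` is a hypothetical helper name, not a function from the codebase.

```python
import numpy as np


def backproject_depth(depth, intrinsic, pose):
    """Lift an (H, W) depth map to world-frame points of shape (N, 3).

    Assumptions (not confirmed by the repository): depth is in metres,
    `intrinsic` is a 3x3 pinhole matrix, `pose` is 4x4 camera-to-world.
    Pixels with non-positive depth are dropped.
    """
    h, w = depth.shape
    # u indexes columns (x direction), v indexes rows (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - intrinsic[0, 2]) * z / intrinsic[0, 0]
    y = (v[valid] - intrinsic[1, 2]) * z / intrinsic[1, 1]
    # Homogeneous camera-frame points, shape (N, 4).
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)
    # Apply the camera-to-world transform and drop the homogeneous coordinate.
    return (pts_cam @ pose.T)[:, :3]
```

If the poses are actually world-to-camera, the pose would need to be inverted first, which is one possible source of a rigid offset between the cloud and the boxes.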

The bounding boxes are not aligned with the associated objects, as shown in the image below, so I suspect there is a coordinate shift between the point cloud visualization and the LLM prediction.

After reading the code, I suspect the misalignment comes from the depth cropping and scaling, and from how the features are backprojected in `RGBDVideoTower`, but I have not found a fix. Could you suggest one, or share sample code that visualizes the point cloud and the predicted bounding boxes so that they are aligned?
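To illustrate why cropping and scaling matter: if a depth map is resized and then center-cropped before backprojection, the intrinsics have to be transformed the same way, otherwise the resulting points are shifted and scaled. The sketch below shows that adjustment under stated assumptions. Whether `RGBDVideoTower` applies exactly this resize-then-center-crop sequence is not confirmed here, and `adjust_intrinsics` and its parameters are hypothetical names.

```python
import numpy as np


def adjust_intrinsics(K, orig_hw, resized_hw, crop_hw):
    """Return a copy of K valid after a resize followed by a center crop.

    All sizes are (height, width) tuples. Assumption: the image is first
    resized from `orig_hw` to `resized_hw`, then center-cropped to `crop_hw`.
    Half-pixel offset conventions are ignored for simplicity.
    """
    K = np.asarray(K, dtype=np.float64).copy()
    sy = resized_hw[0] / orig_hw[0]
    sx = resized_hw[1] / orig_hw[1]
    K[0, :] *= sx  # scales fx and cx
    K[1, :] *= sy  # scales fy and cy
    # A center crop moves the principal point by the removed margin.
    K[0, 2] -= (resized_hw[1] - crop_hw[1]) / 2.0
    K[1, 2] -= (resized_hw[0] - crop_hw[0]) / 2.0
    return K
```

Backprojecting the processed depth with the original intrinsics, or the original depth with adjusted intrinsics, would produce the kind of offset described above.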

I appreciate you looking into this issue.

![Image](https://github.com/user-attachments/assets/6e8aa2e9-8dff-41ad-94d5-82d668ebd9c5)
