📚 Paper Review - 2026-05-26 #244

@github-actions

📚 Daily Paper Review - 2026-05-26

Found 10 relevant papers today. Please review and approve/reject.


1. GlowGS: Generative Semantic Feature Learning for 3D Gaussian Splatting in Nighttime Glow Scenes

Score: 5.0/10 | arXiv: 2605.23602v1

Authors: Beibei Lin, Xiao Cao, Jingyuan Guo...

Relevance:

  • 🎯 Field Match: 1.69/10 - Matches: 3d gaussian, gaussian splatting
  • 🏆 Venue: CVPR (10/10)
  • 💻 Code: ❌ Not mentioned

AI Summary:
Existing 3DGS methods effectively render high-quality novel views in clear-day scenes. However, they struggle with night scenes, particularly in glow regions, due to the lack of structural features such as textures and edges, which are key cues for splatting-based reconstruction. To address this problem, we leverage a diffusion model and a Vision Foundation Model (VFM) to compensate for missing st...

Key Contributions:

  • Existing 3DGS methods effectively render high-quality novel views in clear-day scenes.
  • However, they struggle with night scenes, particularly in glow regions, due to the lack of structural features such as textures and edges, which are key cues for splatting-based reconstruction.
  • To address this problem, we leverage a diffusion model and a Vision Foundation Model (VFM) to compensate for missing structural cues.

Links: 📄 Paper | 📥 PDF

Actions:

  • ✅ Approve: Add label approved and comment "approve"
  • ❌ Reject: Add label rejected and comment "reject"
  • ⭐ Important: Add label starred

2. CVSearch: Empowering Multimodal LLMs with Cognitive Visual Search for High-Resolution Image Perception

Score: 4.8/10 | arXiv: 2605.23655v1

Authors: Liupeng Li, Haoqian Kang, Zhenyu Lu...

Relevance:

  • 🎯 Field Match: 0.0/10 - Matches: none
  • 🏆 Venue: ICML (10/10)
  • 💻 Code: ❌ Not mentioned

AI Summary:
High-resolution (HR) image perception presents a key bottleneck for multimodal large language models (MLLMs). While visual search offers a promising solution, existing methods struggle with the trade-off between coverage and efficiency. Visual expert-assisted search is efficient but prone to blind spots when proposals fail, whereas scan-based search guarantees coverage at the cost of computational...

Key Contributions:

  • High-resolution (HR) image perception presents a key bottleneck for multimodal large language models (MLLMs).
  • While visual search offers a promising solution, existing methods struggle with the trade-off between coverage and efficiency.
  • Visual expert-assisted search is efficient but prone to blind spots when proposals fail, whereas scan-based search guarantees coverage at the cost of computational redundancy and semantic fragmentation.

Links: 📄 Paper | 📥 PDF

Actions:

  • ✅ Approve: Add label approved and comment "approve"
  • ❌ Reject: Add label rejected and comment "reject"
  • ⭐ Important: Add label starred

3. MuellerPT: Decomposition Driven Pretraining for Dense Learning in Mueller Polarimetry

Score: 4.8/10 | arXiv: 2605.23840v1

Authors: Adam Tlemsani, Yingdian Li, Maxime Giot...

Relevance:

  • 🎯 Field Match: 0.51/10 - Matches: segmentation
  • 🏆 Venue: MICCAI (10/10)
  • 💻 Code: ❌ Not mentioned

AI Summary:
Mueller matrix imaging provides rich, physically meaningful contrast for biomedical tissue analysis, but supervised learning is hindered by scarce dense annotations and strong domain shifts across specimens and acquisition settings. We introduce MuellerPT, a physics guided pre-training approach that learns transferable dense representations by predicting Lu-Chipman decomposition maps from per-pixe...

Key Contributions:

  • Mueller matrix imaging provides rich, physically meaningful contrast for biomedical tissue analysis, but supervised learning is hindered by scarce dense annotations and strong domain shifts across specimens and acquisition settings.
  • We introduce MuellerPT, a physics guided pre-training approach that learns transferable dense representations by predicting Lu-Chipman decomposition maps from per-pixel 4x4 Mueller matrices.
  • To scale pre-training, we collected a new large Multispectral Animal Polarimetric Organ dataset (MAP-Org).

Links: 📄 Paper | 📥 PDF

Actions:

  • ✅ Approve: Add label approved and comment "approve"
  • ❌ Reject: Add label rejected and comment "reject"
  • ⭐ Important: Add label starred

4. Learning a Particle Dynamics Model with Real-world Videos

Score: 4.6/10 | arXiv: 2605.23845v1

Authors: Chanho Kim, Suhas V. Sumukh, Li Fuxin

Relevance:

  • 🎯 Field Match: 0.85/10 - Matches: gaussian splatting
  • 🏆 Venue: CVPR (10/10)
  • 💻 Code: ❌ Not mentioned

AI Summary:
Data-driven learning approaches for physics simulation, sometimes referred to as world models, have emerged as promising alternatives to traditional physics simulators due to their differentiable nature. Prior work has demonstrated impressive results in predicting the motions of rigid and non-rigid objects in complex scenes involving multiple interacting bodies. However, these models are typically...

Key Contributions:

  • Data-driven learning approaches for physics simulation, sometimes referred to as world models, have emerged as promising alternatives to traditional physics simulators due to their differentiable nature.
  • Prior work has demonstrated impressive results in predicting the motions of rigid and non-rigid objects in complex scenes involving multiple interacting bodies.
  • However, these models are typically trained in simulated environments because obtaining perfect state information such as complete scene point clouds and point correspondences over time is challenging in real-world settings.

Links: 📄 Paper | 📥 PDF

Actions:

  • ✅ Approve: Add label approved and comment "approve"
  • ❌ Reject: Add label rejected and comment "reject"
  • ⭐ Important: Add label starred

5. Relevant Walk Search for Explaining Graph Neural Networks

Score: 4.4/10 | arXiv: 2605.23673v1

Authors: Ping Xiong, Thomas Schnake, Michael Gastegger...

Relevance:

  • 🎯 Field Match: 0.0/10 - Matches: none
  • 🏆 Venue: ICML (10/10)
  • 💻 Code: ❌ Not mentioned

AI Summary:
Graph Neural Networks (GNNs) have become important machine learning tools for graph analysis, and their explainability is crucial for safety, fairness, and robustness. Layer-wise relevance propagation for GNNs (GNN-LRP) evaluates the relevance of walks to reveal important information flows in the network, and provides higher-order explanations, which have been shown to be superior to the lowe...

Key Contributions:

  • Graph Neural Networks (GNNs) have become important machine learning tools for graph analysis, and their explainability is crucial for safety, fairness, and robustness.
  • Layer-wise relevance propagation for GNNs (GNN-LRP) evaluates the relevance of walks to reveal important information flows in the network, and provides higher-order explanations, which have been shown to be superior to the lower-order, i.e. ...

Links: 📄 Paper | 📥 PDF

Actions:

  • ✅ Approve: Add label approved and comment "approve"
  • ❌ Reject: Add label rejected and comment "reject"
  • ⭐ Important: Add label starred

6. LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

Score: 4.2/10 | arXiv: 2605.23901v1

Authors: Xu Ouyang, Deyi Liu, Yuhang Cai...

Relevance:

  • 🎯 Field Match: 0.0/10 - Matches: none
  • 🏆 Venue: ICML (10/10)
  • 💻 Code: ❌ Not mentioned

AI Summary:
Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute. We propose the Shannon Scaling Law, a unified theoretical framework that models LLM training as information transmission over a...

Key Contributions:

  • Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute.
  • We propose the Shannon Scaling Law, a unified theoretical framework that models LLM training as information transmission over a noisy channel, grounded in the Shannon-Hartley theorem.
  • By mapping model parameters to channel bandwidth and training tokens to signal power, our formulation explicitly captures the interaction between learning signal and intrinsic noise.

Links: 📄 Paper | 📥 PDF

Actions:

  • ✅ Approve: Add label approved and comment "approve"
  • ❌ Reject: Add label rejected and comment "reject"
  • ⭐ Important: Add label starred

7. Good Token Hunting: A Hitchhiker's Guide to Token Selection for Visual Geometry Transformers

Score: 4.1/10 | arXiv: 2605.23892v1

Authors: Shuhong Zheng, Michael Oechsle, Erik Sandström...

Relevance:

  • 🎯 Field Match: 0.59/10 - Matches: 3d reconstruction
  • 🏆 Venue: None (5.0/10)
  • 💻 Code: ✅ Available

AI Summary:
Visual geometry transformers have become powerful architectures for multi-view 3D reconstruction, enabling joint prediction of multiple 3D attributes in a feed-forward manner. However, their computational cost grows quadratically with the input sequence length due to the global attention layers inside these models. This limits both their scalability and efficiency. In this work, we address this ch...

Key Contributions:

  • Visual geometry transformers have become powerful architectures for multi-view 3D reconstruction, enabling joint prediction of multiple 3D attributes in a feed-forward manner.
  • However, their computational cost grows quadratically with the input sequence length due to the global attention layers inside these models.
  • This limits both their scalability and efficiency.

Links: 📄 Paper | 📥 PDF

Actions:

  • ✅ Approve: Add label approved and comment "approve"
  • ❌ Reject: Add label rejected and comment "reject"
  • ⭐ Important: Add label starred

8. GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction

Score: 4.0/10 | arXiv: 2605.23888v1

Authors: Katharina Schmid, Nicolas von Lützow, Jozef Hladký...

Relevance:

  • 🎯 Field Match: 0.0/10 - Matches: none
  • 🏆 Venue: None (5.0/10)
  • 💻 Code: ✅ Available

AI Summary:
We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior. We cast scene reconstruction as conditional 3D generation over a set of spatially-localized, overlapping chunks that together tile the scene, scaling generation to large scene extents. Crucially, we inherit the fidelity and completen...

Key Contributions:

  • We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior.
  • We cast scene reconstruction as conditional 3D generation over a set of spatially-localized, overlapping chunks that together tile the scene, scaling generation to large scene extents.
  • Crucially, we inherit the fidelity and completeness of state-of-the-art generative shape models -- we use Trellis.

Links: 📄 Paper | 📥 PDF

Actions:

  • ✅ Approve: Add label approved and comment "approve"
  • ❌ Reject: Add label rejected and comment "reject"
  • ⭐ Important: Add label starred

9. Exploring deep learning for Event-Based Saliency Prediction with a Transformer-based model

Score: 3.9/10 | arXiv: 2605.23790v1

Authors: Romaric Mazna, Jean Martinet, Sai Deepesh Pokala

Relevance:

  • 🎯 Field Match: 1.1/10 - Matches: self-supervised, deep learning
  • 🏆 Venue: None (5.0/10)
  • 💻 Code: ❌ Not mentioned

AI Summary:
Saliency prediction has been extensively studied in RGB images and videos as a computational model of human visual attention. In contrast, predicting saliency from event-based data remains largely unexplored, despite the biological inspiration and favorable sensing properties of event cameras. Two obstacles have held this direction back: the absence of large-scale event saliency datasets, and the ...

Key Contributions:

  • Saliency prediction has been extensively studied in RGB images and videos as a computational model of human visual attention.
  • In contrast, predicting saliency from event-based data remains largely unexplored, despite the biological inspiration and favorable sensing properties of event cameras.
  • Two obstacles have held this direction back: the absence of large-scale event saliency datasets, and the lack of a strong baseline.

Links: 📄 Paper | 📥 PDF

Actions:

  • ✅ Approve: Add label approved and comment "approve"
  • ❌ Reject: Add label rejected and comment "reject"
  • ⭐ Important: Add label starred

10. Revitalizing Dense Material Segmentation: Stabilized Vision Transformers and the Generalization Paradox

Score: 3.9/10 | arXiv: 2605.23747v1

Authors: Allan Kazakov, Duygu Cakir, Hilal Kurt İrfanoğlu...

Relevance:

  • 🎯 Field Match: 0.93/10 - Matches: segmentation, computer vision
  • 🏆 Venue: None (5.0/10)
  • 💻 Code: ❌ Not mentioned

AI Summary:
Material segmentation, the pixel-wise classification of physical surface properties, remains a challenging problem in computer vision, requiring physicochemical understanding distinct from object-centric parsing. Despite the introduction of the rigorous Apple Dense Material Segmentation (DMS) dataset, the benchmark has suffered from attrition and stagnation, increasingly overshadowed by geometry-b...

Key Contributions:

  • Material segmentation, the pixel-wise classification of physical surface properties, remains a challenging problem in computer vision, requiring physicochemical understanding distinct from object-centric parsing.
  • Despite the introduction of the rigorous Apple Dense Material Segmentation (DMS) dataset, the benchmark has suffered from attrition and stagnation, increasingly overshadowed by geometry-biased foundation models.
  • In this paper, we revive the Apple-DMS benchmark to establish a modern Vision Transformer baseline.

Links: 📄 Paper | 📥 PDF

Actions:

  • ✅ Approve: Add label approved and comment "approve"
  • ❌ Reject: Add label rejected and comment "reject"
  • ⭐ Important: Add label starred

How to Review

  1. Read the summaries above
  2. Check paper links for more details
  3. Add labels to indicate your decision:
    • approved - Add to collection
    • rejected - Skip this paper
    • starred - Mark as particularly important
  4. Comment "approve" or "reject" to trigger automation

Note: Papers with the approved label are automatically added to the collection.
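The review steps above can be sketched as a small decision function. This is only an illustration of how a comment keyword and the issue's labels might be combined. The names `KEYWORD_TO_LABEL`, `parse_keyword`, and `review_outcome`, and the rule that a keyword only counts once its matching label is present, are assumptions and not taken from the actual automation.

```python
from typing import Iterable, Optional

# Hypothetical mapping from the comment keyword to the label named in this issue.
KEYWORD_TO_LABEL = {"approve": "approved", "reject": "rejected"}


def parse_keyword(comment: str) -> Optional[str]:
    """Return "approve" or "reject" if the comment is exactly that word, else None."""
    word = comment.strip().strip('"').strip().lower()
    return word if word in KEYWORD_TO_LABEL else None


def review_outcome(comment: str, labels: Iterable[str]) -> dict:
    """Combine one comment and the issue's labels into a review outcome."""
    label_set = {label.lower() for label in labels}
    keyword = parse_keyword(comment)
    if keyword is None:
        decision = "none"           # comment is not a review keyword
    elif KEYWORD_TO_LABEL[keyword] in label_set:
        decision = keyword          # keyword and matching label agree
    else:
        decision = "missing-label"  # keyword given, matching label not added yet
    # "starred" is tracked separately because it marks importance, not a decision.
    return {"decision": decision, "starred": "starred" in label_set}


print(review_outcome("approve", ["approved", "starred"]))
# prints {'decision': 'approve', 'starred': True}
```

Requiring both the label and the comment mirrors the two-part instruction in the Actions lists. Whether the real workflow enforces both is not stated here.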
