📚 Paper Review - 2026-05-26

# 📚 Daily Paper Review - 2026-05-26

Found **10** relevant papers today. Please review and approve/reject.

---

## 1. GlowGS: Generative Semantic Feature Learning for 3D Gaussian Splatting in Nighttime Glow Scenes

**Score:** `5.0/10` | **arXiv:** [2605.23602v1](http://arxiv.org/abs/2605.23602v1)

**Authors:** Beibei Lin, Xiao Cao, Jingyuan Guo...

**Relevance:**
- 🎯 Field Match: 1.69/10 - Matches: 3d gaussian, gaussian splatting
- 🏆 Venue: CVPR (10/10)
- 💻 Code: ❌ Not mentioned

**AI Summary:**
Existing 3DGS methods effectively render high-quality novel views in clear-day scenes. However, they struggle with night scenes, particularly in glow regions, due to the lack of structural features such as textures and edges, which are key cues for splatting-based reconstruction. To address this problem, we leverage a diffusion model and a Vision Foundation Model (VFM) to compensate for missing st...

**Key Contributions:**
- Existing 3DGS methods effectively render high-quality novel views in clear-day scenes.
- However, they struggle with night scenes, particularly in glow regions, due to the lack of structural features such as textures and edges, which are key cues for splatting-based reconstruction.
- To address this problem, we leverage a diffusion model and a Vision Foundation Model (VFM) to compensate for missing structural cues.

**Links:** [📄 Paper](http://arxiv.org/abs/2605.23602v1) | [📥 PDF](https://arxiv.org/pdf/2605.23602v1)

**Actions:**
- ✅ Approve: Add label `approved` and comment "approve"
- ❌ Reject: Add label `rejected` and comment "reject"
- ⭐ Important: Add label `starred`

---

## 2. CVSearch: Empowering Multimodal LLMs with Cognitive Visual Search for High-Resolution Image Perception

**Score:** `4.8/10` | **arXiv:** [2605.23655v1](http://arxiv.org/abs/2605.23655v1)

**Authors:** Liupeng Li, Haoqian Kang, Zhenyu Lu...

**Relevance:**
- 🎯 Field Match: 0.0/10 - Matches: none
- 🏆 Venue: ICML (10/10)
- 💻 Code: ❌ Not mentioned

**AI Summary:**
High-resolution (HR) image perception presents a key bottleneck for multimodal large language models (MLLMs). While visual search offers a promising solution, existing methods struggle with the trade-off between coverage and efficiency. Visual expert-assisted search is efficient but prone to blind spots when proposals fail, whereas scan-based search guarantees coverage at the cost of computational...

**Key Contributions:**
- High-resolution (HR) image perception presents a key bottleneck for multimodal large language models (MLLMs).
- While visual search offers a promising solution, existing methods struggle with the trade-off between coverage and efficiency.
- Visual expert-assisted search is efficient but prone to blind spots when proposals fail, whereas scan-based search guarantees coverage at the cost of computational redundancy and semantic fragmentation.

**Links:** [📄 Paper](http://arxiv.org/abs/2605.23655v1) | [📥 PDF](https://arxiv.org/pdf/2605.23655v1)

**Actions:**
- ✅ Approve: Add label `approved` and comment "approve"
- ❌ Reject: Add label `rejected` and comment "reject"
- ⭐ Important: Add label `starred`

---

## 3. MuellerPT: Decomposition Driven Pretraining for Dense Learning in Mueller Polarimetry

**Score:** `4.8/10` | **arXiv:** [2605.23840v1](http://arxiv.org/abs/2605.23840v1)

**Authors:** Adam Tlemsani, Yingdian Li, Maxime Giot...

**Relevance:**
- 🎯 Field Match: 0.51/10 - Matches: segmentation
- 🏆 Venue: MICCAI (10/10)
- 💻 Code: ❌ Not mentioned

**AI Summary:**
Mueller matrix imaging provides rich, physically meaningful contrast for biomedical tissue analysis, but supervised learning is hindered by scarce dense annotations and strong domain shifts across specimens and acquisition settings. We introduce MuellerPT, a physics guided pre-training approach that learns transferable dense representations by predicting Lu-Chipman decomposition maps from per-pixe...

**Key Contributions:**
- Mueller matrix imaging provides rich, physically meaningful contrast for biomedical tissue analysis, but supervised learning is hindered by scarce dense annotations and strong domain shifts across specimens and acquisition settings.
- We introduce MuellerPT, a physics guided pre-training approach that learns transferable dense representations by predicting Lu-Chipman decomposition maps from per-pixel 4x4 Mueller matrices.
- To scale pre-training, we collected a new large Multispectral Animal Polarimetric Organ dataset (MAP-Org).
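
For orientation, the standard textbook relations behind the terms used above can be written as follows. This is general polarimetry background, not taken from the paper: a Mueller matrix maps an input Stokes vector to an output Stokes vector, and the Lu-Chipman polar decomposition factors it into a depolarizer, a retarder, and a diattenuator. Which derived maps MuellerPT actually predicts is not stated in this summary.

```latex
\mathbf{S}_{\text{out}} = \mathbf{M}\,\mathbf{S}_{\text{in}},
\qquad \mathbf{S} = (I,\ Q,\ U,\ V)^{\top},
\qquad \mathbf{M} \in \mathbb{R}^{4 \times 4}

% Lu-Chipman decomposition: depolarizer, retarder, diattenuator
\mathbf{M} = \mathbf{M}_{\Delta}\,\mathbf{M}_{R}\,\mathbf{M}_{D}
```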

**Links:** [📄 Paper](http://arxiv.org/abs/2605.23840v1) | [📥 PDF](https://arxiv.org/pdf/2605.23840v1)

**Actions:**
- ✅ Approve: Add label `approved` and comment "approve"
- ❌ Reject: Add label `rejected` and comment "reject"
- ⭐ Important: Add label `starred`

---

## 4. Learning a Particle Dynamics Model with Real-world Videos

**Score:** `4.6/10` | **arXiv:** [2605.23845v1](http://arxiv.org/abs/2605.23845v1)

**Authors:** Chanho Kim, Suhas V. Sumukh, Li Fuxin

**Relevance:**
- 🎯 Field Match: 0.85/10 - Matches: gaussian splatting
- 🏆 Venue: CVPR (10/10)
- 💻 Code: ❌ Not mentioned

**AI Summary:**
Data-driven learning approaches for physics simulation, sometimes referred to as world models, have emerged as promising alternatives to traditional physics simulators due to their differentiable nature. Prior work has demonstrated impressive results in predicting the motions of rigid and non-rigid objects in complex scenes involving multiple interacting bodies. However, these models are typically...

**Key Contributions:**
- Data-driven learning approaches for physics simulation, sometimes referred to as world models, have emerged as promising alternatives to traditional physics simulators due to their differentiable nature.
- Prior work has demonstrated impressive results in predicting the motions of rigid and non-rigid objects in complex scenes involving multiple interacting bodies.
- However, these models are typically trained in simulated environments because obtaining perfect state information such as complete scene point clouds and point correspondences over time is challenging in real-world settings.

**Links:** [📄 Paper](http://arxiv.org/abs/2605.23845v1) | [📥 PDF](https://arxiv.org/pdf/2605.23845v1)

**Actions:**
- ✅ Approve: Add label `approved` and comment "approve"
- ❌ Reject: Add label `rejected` and comment "reject"
- ⭐ Important: Add label `starred`

---

## 5. Relevant Walk Search for Explaining Graph Neural Networks

**Score:** `4.4/10` | **arXiv:** [2605.23673v1](http://arxiv.org/abs/2605.23673v1)

**Authors:** Ping Xiong, Thomas Schnake, Michael Gastegger...

**Relevance:**
- 🎯 Field Match: 0.0/10 - Matches: none
- 🏆 Venue: ICML (10/10)
- 💻 Code: ❌ Not mentioned

**AI Summary:**
Graph Neural Networks (GNNs) have become important machine learning tools for graph analysis, and their explainability is crucial for safety, fairness, and robustness. Layer-wise relevance propagation for GNNs (GNN-LRP) evaluates the relevance of *walks* to reveal important information flows in the network, and provides higher-order explanations, which have been shown to be superior to the lowe...

**Key Contributions:**
- Graph Neural Networks (GNNs) have become important machine learning tools for graph analysis, and their explainability is crucial for safety, fairness, and robustness.
- Layer-wise relevance propagation for GNNs (GNN-LRP) evaluates the relevance of *walks* to reveal important information flows in the network, and provides higher-order explanations, which have been shown to be superior to the lower-order, i.e. ...

**Links:** [📄 Paper](http://arxiv.org/abs/2605.23673v1) | [📥 PDF](https://arxiv.org/pdf/2605.23673v1)

**Actions:**
- ✅ Approve: Add label `approved` and comment "approve"
- ❌ Reject: Add label `rejected` and comment "reject"
- ⭐ Important: Add label `starred`

---

## 6. LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

**Score:** `4.2/10` | **arXiv:** [2605.23901v1](http://arxiv.org/abs/2605.23901v1)

**Authors:** Xu Ouyang, Deyi Liu, Yuhang Cai...

**Relevance:**
- 🎯 Field Match: 0.0/10 - Matches: none
- 🏆 Venue: ICML (10/10)
- 💻 Code: ❌ Not mentioned

**AI Summary:**
Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute. We propose the Shannon Scaling Law, a unified theoretical framework that models LLM training as information transmission over a...

**Key Contributions:**
- Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute.
- We propose the Shannon Scaling Law, a unified theoretical framework that models LLM training as information transmission over a noisy channel, grounded in the Shannon-Hartley theorem.
- By mapping model parameters to channel bandwidth and training tokens to signal power, our formulation explicitly captures the interaction between learning signal and intrinsic noise.
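
For reference, the classical Shannon-Hartley theorem that the abstract cites is shown below. This is only the textbook statement for a band-limited channel with additive white Gaussian noise; the paper's own scaling law is not given in this summary and is not reproduced here. Per the abstract, model parameters play the role of the bandwidth $B$ and training tokens play the role of the signal power $S$.

```latex
% C: channel capacity (bits per second)
% B: bandwidth (Hz)
% S/N: signal-to-noise ratio as a linear power ratio
C = B \log_2\!\left(1 + \frac{S}{N}\right)
```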

**Links:** [📄 Paper](http://arxiv.org/abs/2605.23901v1) | [📥 PDF](https://arxiv.org/pdf/2605.23901v1)

**Actions:**
- ✅ Approve: Add label `approved` and comment "approve"
- ❌ Reject: Add label `rejected` and comment "reject"
- ⭐ Important: Add label `starred`

---

## 7. Good Token Hunting: A Hitchhiker's Guide to Token Selection for Visual Geometry Transformers

**Score:** `4.1/10` | **arXiv:** [2605.23892v1](http://arxiv.org/abs/2605.23892v1)

**Authors:** Shuhong Zheng, Michael Oechsle, Erik Sandström...

**Relevance:**
- 🎯 Field Match: 0.59/10 - Matches: 3d reconstruction
- 🏆 Venue: None (5.0/10)
- 💻 Code: ✅ Available

**AI Summary:**
Visual geometry transformers have become powerful architectures for multi-view 3D reconstruction, enabling joint prediction of multiple 3D attributes in a feed-forward manner. However, their computational cost grows quadratically with the input sequence length due to the global attention layers inside these models. This limits both their scalability and efficiency. In this work, we address this ch...

**Key Contributions:**
- Visual geometry transformers have become powerful architectures for multi-view 3D reconstruction, enabling joint prediction of multiple 3D attributes in a feed-forward manner.
- However, their computational cost grows quadratically with the input sequence length due to the global attention layers inside these models.
- This limits both their scalability and efficiency.
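
The quadratic growth mentioned above can be illustrated with a small back-of-the-envelope sketch. It only counts query-key score entries for one global attention layer and shows how a fixed keep ratio changes that count. The function names and the keep-a-fraction rule are hypothetical illustrations, not the token selection method from the paper.

```python
def attention_pairs(num_tokens: int) -> int:
    """Query-key score entries in one global self-attention layer (per head)."""
    return num_tokens * num_tokens


def tokens_after_selection(num_tokens: int, keep_ratio: float) -> int:
    """Hypothetical token budget: keep a fixed fraction of tokens, at least one."""
    return max(1, int(num_tokens * keep_ratio))


if __name__ == "__main__":
    # Doubling the sequence length quadruples the number of score entries;
    # keeping half of the tokens cuts it to a quarter.
    for n in (1_000, 2_000, 4_000):
        kept = tokens_after_selection(n, keep_ratio=0.5)
        print(n, attention_pairs(n), kept, attention_pairs(kept))
```

Under this simplified count, any selection scheme that keeps a fraction `r` of the tokens reduces the attention score entries by a factor of `r * r`, which is why token selection is attractive for long multi-view sequences.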

**Links:** [📄 Paper](http://arxiv.org/abs/2605.23892v1) | [📥 PDF](https://arxiv.org/pdf/2605.23892v1)

**Actions:**
- ✅ Approve: Add label `approved` and comment "approve"
- ❌ Reject: Add label `rejected` and comment "reject"
- ⭐ Important: Add label `starred`

---

## 8. GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction

**Score:** `4.0/10` | **arXiv:** [2605.23888v1](http://arxiv.org/abs/2605.23888v1)

**Authors:** Katharina Schmid, Nicolas von Lützow, Jozef Hladký...

**Relevance:**
- 🎯 Field Match: 0.0/10 - Matches: none
- 🏆 Venue: None (5.0/10)
- 💻 Code: ✅ Available

**AI Summary:**
We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior. We cast scene reconstruction as conditional 3D generation over a set of spatially-localized, overlapping chunks that together tile the scene, scaling generation to large scene extents. Crucially, we inherit the fidelity and completen...

**Key Contributions:**
- We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior.
- We cast scene reconstruction as conditional 3D generation over a set of spatially-localized, overlapping chunks that together tile the scene, scaling generation to large scene extents.
- Crucially, we inherit the fidelity and completeness of state-of-the-art generative shape models -- we use Trellis.

**Links:** [📄 Paper](http://arxiv.org/abs/2605.23888v1) | [📥 PDF](https://arxiv.org/pdf/2605.23888v1)

**Actions:**
- ✅ Approve: Add label `approved` and comment "approve"
- ❌ Reject: Add label `rejected` and comment "reject"
- ⭐ Important: Add label `starred`

---

## 9. Exploring deep learning for Event-Based Saliency Prediction with a Transformer-based model

**Score:** `3.9/10` | **arXiv:** [2605.23790v1](http://arxiv.org/abs/2605.23790v1)

**Authors:** Romaric Mazna, Jean Martinet, Sai Deepesh Pokala

**Relevance:**
- 🎯 Field Match: 1.1/10 - Matches: self-supervised, deep learning
- 🏆 Venue: None (5.0/10)
- 💻 Code: ❌ Not mentioned

**AI Summary:**
Saliency prediction has been extensively studied in RGB images and videos as a computational model of human visual attention. In contrast, predicting saliency from event-based data remains largely unexplored, despite the biological inspiration and favorable sensing properties of event cameras. Two obstacles have held this direction back: the absence of large-scale event saliency datasets, and the ...

**Key Contributions:**
- Saliency prediction has been extensively studied in RGB images and videos as a computational model of human visual attention.
- In contrast, predicting saliency from event-based data remains largely unexplored, despite the biological inspiration and favorable sensing properties of event cameras.
- Two obstacles have held this direction back: the absence of large-scale event saliency datasets, and the lack of a strong baseline.

**Links:** [📄 Paper](http://arxiv.org/abs/2605.23790v1) | [📥 PDF](https://arxiv.org/pdf/2605.23790v1)

**Actions:**
- ✅ Approve: Add label `approved` and comment "approve"
- ❌ Reject: Add label `rejected` and comment "reject"
- ⭐ Important: Add label `starred`

---

## 10. Revitalizing Dense Material Segmentation: Stabilized Vision Transformers and the Generalization Paradox

**Score:** `3.9/10` | **arXiv:** [2605.23747v1](http://arxiv.org/abs/2605.23747v1)

**Authors:** Allan Kazakov, Duygu Cakir, Hilal Kurt İrfanoğlu...

**Relevance:**
- 🎯 Field Match: 0.93/10 - Matches: segmentation, computer vision
- 🏆 Venue: None (5.0/10)
- 💻 Code: ❌ Not mentioned

**AI Summary:**
Material segmentation, the pixel-wise classification of physical surface properties, remains a challenging problem in computer vision, requiring physicochemical understanding distinct from object-centric parsing. Despite the introduction of the rigorous Apple Dense Material Segmentation (DMS) dataset, the benchmark has suffered from attrition and stagnation, increasingly overshadowed by geometry-b...

**Key Contributions:**
- Material segmentation, the pixel-wise classification of physical surface properties, remains a challenging problem in computer vision, requiring physicochemical understanding distinct from object-centric parsing.
- Despite the introduction of the rigorous Apple Dense Material Segmentation (DMS) dataset, the benchmark has suffered from attrition and stagnation, increasingly overshadowed by geometry-biased foundation models.
- In this paper, we revive the Apple-DMS benchmark to establish a modern Vision Transformer baseline.

**Links:** [📄 Paper](http://arxiv.org/abs/2605.23747v1) | [📥 PDF](https://arxiv.org/pdf/2605.23747v1)

**Actions:**
- ✅ Approve: Add label `approved` and comment "approve"
- ❌ Reject: Add label `rejected` and comment "reject"
- ⭐ Important: Add label `starred`

---


## How to Review

1. Read the summaries above
2. Check paper links for more details
3. Add labels to indicate your decision:
   - `approved` - Add to collection
   - `rejected` - Skip this paper
   - `starred` - Mark as particularly important
4. Comment "approve" or "reject" to trigger automation

**Note:** Papers with `approved` label will be automatically added to the collection.
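
The review steps above can be sketched as a small decision function. This is a minimal illustration, not the actual automation behind this issue: the function names are hypothetical, and it assumes that a decision requires both the label and the matching comment, as the per-paper Actions lists suggest.

```python
from typing import Iterable, Optional


def review_decision(labels: Iterable[str], comment: str) -> Optional[str]:
    """Return 'approve', 'reject', or None (assumed rule: label AND comment)."""
    label_set = {label.strip().lower() for label in labels}
    word = comment.strip().lower()
    if word == "approve" and "approved" in label_set:
        return "approve"
    if word == "reject" and "rejected" in label_set:
        return "reject"
    return None  # No consistent label/comment pair, so nothing is triggered.


def is_starred(labels: Iterable[str]) -> bool:
    """True when the issue carries the `starred` label."""
    return "starred" in {label.strip().lower() for label in labels}


if __name__ == "__main__":
    print(review_decision(["approved", "starred"], "approve"))
    print(review_decision(["rejected"], "reject"))
    print(review_decision(["approved"], "reject"))
```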


📚 Paper Review - 2026-05-26 #244
