📚 Daily Paper Review - 2026-05-26
Found 10 relevant papers today. Please review and approve/reject.
1. GlowGS: Generative Semantic Feature Learning for 3D Gaussian Splatting in Nighttime Glow Scenes
Score: 5.0/10 | arXiv: 2605.23602v1
Authors: Beibei Lin, Xiao Cao, Jingyuan Guo...
Relevance:
- 🎯 Field Match: 1.69/10 - Matches: 3d gaussian, gaussian splatting
- 🏆 Venue: CVPR (10/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
Existing 3DGS methods effectively render high-quality novel views in clear-day scenes. However, they struggle with night scenes, particularly in glow regions, due to the lack of structural features such as textures and edges, which are key cues for splatting-based reconstruction. To address this problem, we leverage a diffusion model and a Vision Foundation Model (VFM) to compensate for missing st...
Key Contributions:
- Existing 3DGS methods effectively render high-quality novel views in clear-day scenes.
- However, they struggle with night scenes, particularly in glow regions, due to the lack of structural features such as textures and edges, which are key cues for splatting-based reconstruction.
- To address this problem, we leverage a diffusion model and a Vision Foundation Model (VFM) to compensate for missing structural cues.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
2. CVSearch: Empowering Multimodal LLMs with Cognitive Visual Search for High-Resolution Image Perception
Score: 4.8/10 | arXiv: 2605.23655v1
Authors: Liupeng Li, Haoqian Kang, Zhenyu Lu...
Relevance:
- 🎯 Field Match: 0.0/10 - Matches:
- 🏆 Venue: ICML (10/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
High-resolution (HR) image perception presents a key bottleneck for multimodal large language models (MLLMs). While visual search offers a promising solution, existing methods struggle with the trade-off between coverage and efficiency. Visual expert-assisted search is efficient but prone to blind spots when proposals fail, whereas scan-based search guarantees coverage at the cost of computational...
Key Contributions:
- High-resolution (HR) image perception presents a key bottleneck for multimodal large language models (MLLMs).
- While visual search offers a promising solution, existing methods struggle with the trade-off between coverage and efficiency.
- Visual expert-assisted search is efficient but prone to blind spots when proposals fail, whereas scan-based search guarantees coverage at the cost of computational redundancy and semantic fragmentation.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
3. MuellerPT: Decomposition Driven Pretraining for Dense Learning in Mueller Polarimetry
Score: 4.8/10 | arXiv: 2605.23840v1
Authors: Adam Tlemsani, Yingdian Li, Maxime Giot...
Relevance:
- 🎯 Field Match: 0.51/10 - Matches: segmentation
- 🏆 Venue: MICCAI (10/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
Mueller matrix imaging provides rich, physically meaningful contrast for biomedical tissue analysis, but supervised learning is hindered by scarce dense annotations and strong domain shifts across specimens and acquisition settings. We introduce MuellerPT, a physics guided pre-training approach that learns transferable dense representations by predicting Lu-Chipman decomposition maps from per-pixe...
Key Contributions:
- Mueller matrix imaging provides rich, physically meaningful contrast for biomedical tissue analysis, but supervised learning is hindered by scarce dense annotations and strong domain shifts across specimens and acquisition settings.
- We introduce MuellerPT, a physics guided pre-training approach that learns transferable dense representations by predicting Lu-Chipman decomposition maps from per-pixel 4x4 Mueller matrices.
- To scale pre-training, we collected a new large Multispectral Animal Polarimetric Organ dataset (MAP-Org).
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
4. Learning a Particle Dynamics Model with Real-world Videos
Score: 4.6/10 | arXiv: 2605.23845v1
Authors: Chanho Kim, Suhas V. Sumukh, Li Fuxin
Relevance:
- 🎯 Field Match: 0.85/10 - Matches: gaussian splatting
- 🏆 Venue: CVPR (10/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
Data-driven learning approaches for physics simulation, sometimes referred to as world models, have emerged as promising alternatives to traditional physics simulators due to their differentiable nature. Prior work has demonstrated impressive results in predicting the motions of rigid and non-rigid objects in complex scenes involving multiple interacting bodies. However, these models are typically...
Key Contributions:
- Data-driven learning approaches for physics simulation, sometimes referred to as world models, have emerged as promising alternatives to traditional physics simulators due to their differentiable nature.
- Prior work has demonstrated impressive results in predicting the motions of rigid and non-rigid objects in complex scenes involving multiple interacting bodies.
- However, these models are typically trained in simulated environments because obtaining perfect state information such as complete scene point clouds and point correspondences over time is challenging in real-world settings.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
5. Relevant Walk Search for Explaining Graph Neural Networks
Score: 4.4/10 | arXiv: 2605.23673v1
Authors: Ping Xiong, Thomas Schnake, Michael Gastegger...
Relevance:
- 🎯 Field Match: 0.0/10 - Matches:
- 🏆 Venue: ICML (10/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
Graph Neural Networks (GNNs) have become important machine learning tools for graph analysis, and their explainability is crucial for safety, fairness, and robustness. Layer-wise relevance propagation for GNNs (GNN-LRP) evaluates the relevance of walks to reveal important information flows in the network, and provides higher-order explanations, which have been shown to be superior to the lowe...
Key Contributions:
- Graph Neural Networks (GNNs) have become important machine learning tools for graph analysis, and their explainability is crucial for safety, fairness, and robustness.
- Layer-wise relevance propagation for GNNs (GNN-LRP) evaluates the relevance of walks to reveal important information flows in the network, and provides higher-order explanations, which have been shown to be superior to the lower-order, i.e. ...
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
6. LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws
Score: 4.2/10 | arXiv: 2605.23901v1
Authors: Xu Ouyang, Deyi Liu, Yuhang Cai...
Relevance:
- 🎯 Field Match: 0.0/10 - Matches:
- 🏆 Venue: ICML (10/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute. We propose the Shannon Scaling Law, a unified theoretical framework that models LLM training as information transmission over a...
Key Contributions:
- Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute.
- We propose the Shannon Scaling Law, a unified theoretical framework that models LLM training as information transmission over a noisy channel, grounded in the Shannon-Hartley theorem.
- By mapping model parameters to channel bandwidth and training tokens to signal power, our formulation explicitly captures the interaction between learning signal and intrinsic noise.
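For reference while reviewing, the Shannon-Hartley theorem named in the contributions above has the standard textbook form below. This is the general theorem only, not the paper's scaling law: the summary says parameters play the role of bandwidth and training tokens the role of signal power, but the paper's exact functional form is not given here.

```latex
% Shannon-Hartley theorem (standard form, not the paper's fitted law)
% C   : channel capacity in bits per second
% B   : channel bandwidth in hertz
% S/N : signal-to-noise power ratio (linear, not in dB)
C = B \log_2\!\left(1 + \frac{S}{N}\right)
```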
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
7. Good Token Hunting: A Hitchhiker's Guide to Token Selection for Visual Geometry Transformers
Score: 4.1/10 | arXiv: 2605.23892v1
Authors: Shuhong Zheng, Michael Oechsle, Erik Sandström...
Relevance:
- 🎯 Field Match: 0.59/10 - Matches: 3d reconstruction
- 🏆 Venue: None (5.0/10)
- 💻 Code: ✅ Available
AI Summary:
Visual geometry transformers have become powerful architectures for multi-view 3D reconstruction, enabling joint prediction of multiple 3D attributes in a feed-forward manner. However, their computational cost grows quadratically with the input sequence length due to the global attention layers inside these models. This limits both their scalability and efficiency. In this work, we address this ch...
Key Contributions:
- Visual geometry transformers have become powerful architectures for multi-view 3D reconstruction, enabling joint prediction of multiple 3D attributes in a feed-forward manner.
- However, their computational cost grows quadratically with the input sequence length due to the global attention layers inside these models.
- This limits both their scalability and efficiency.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
8. GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction
Score: 4.0/10 | arXiv: 2605.23888v1
Authors: Katharina Schmid, Nicolas von Lützow, Jozef Hladký...
Relevance:
- 🎯 Field Match: 0.0/10 - Matches:
- 🏆 Venue: None (5.0/10)
- 💻 Code: ✅ Available
AI Summary:
We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior. We cast scene reconstruction as conditional 3D generation over a set of spatially-localized, overlapping chunks that together tile the scene, scaling generation to large scene extents. Crucially, we inherit the fidelity and completen...
Key Contributions:
- We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior.
- We cast scene reconstruction as conditional 3D generation over a set of spatially-localized, overlapping chunks that together tile the scene, scaling generation to large scene extents.
- Crucially, we inherit the fidelity and completeness of state-of-the-art generative shape models -- we use Trellis.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
9. Exploring deep learning for Event-Based Saliency Prediction with a Transformer-based model
Score: 3.9/10 | arXiv: 2605.23790v1
Authors: Romaric Mazna, Jean Martinet, Sai Deepesh Pokala
Relevance:
- 🎯 Field Match: 1.1/10 - Matches: self-supervised, deep learning
- 🏆 Venue: None (5.0/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
Saliency prediction has been extensively studied in RGB images and videos as a computational model of human visual attention. In contrast, predicting saliency from event-based data remains largely unexplored, despite the biological inspiration and favorable sensing properties of event cameras. Two obstacles have held this direction back: the absence of large-scale event saliency datasets, and the ...
Key Contributions:
- Saliency prediction has been extensively studied in RGB images and videos as a computational model of human visual attention.
- In contrast, predicting saliency from event-based data remains largely unexplored, despite the biological inspiration and favorable sensing properties of event cameras.
- Two obstacles have held this direction back: the absence of large-scale event saliency datasets, and the lack of a strong baseline.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
10. Revitalizing Dense Material Segmentation: Stabilized Vision Transformers and the Generalization Paradox
Score: 3.9/10 | arXiv: 2605.23747v1
Authors: Allan Kazakov, Duygu Cakir, Hilal Kurt İrfanoğlu...
Relevance:
- 🎯 Field Match: 0.93/10 - Matches: segmentation, computer vision
- 🏆 Venue: None (5.0/10)
- 💻 Code: ❌ Not mentioned
AI Summary:
Material segmentation, the pixel-wise classification of physical surface properties, remains a challenging problem in computer vision, requiring physicochemical understanding distinct from object-centric parsing. Despite the introduction of the rigorous Apple Dense Material Segmentation (DMS) dataset, the benchmark has suffered from attrition and stagnation, increasingly overshadowed by geometry-b...
Key Contributions:
- Material segmentation, the pixel-wise classification of physical surface properties, remains a challenging problem in computer vision, requiring physicochemical understanding distinct from object-centric parsing.
- Despite the introduction of the rigorous Apple Dense Material Segmentation (DMS) dataset, the benchmark has suffered from attrition and stagnation, increasingly overshadowed by geometry-biased foundation models.
- In this paper, we revive the Apple-DMS benchmark to establish a modern Vision Transformer baseline.
Links: 📄 Paper | 📥 PDF
Actions:
- ✅ Approve: Add label approved and comment "approve"
- ❌ Reject: Add label rejected and comment "reject"
- ⭐ Important: Add label starred
How to Review
- Read the summaries above
- Check paper links for more details
- Add labels to indicate your decision:
  - approved: Add to collection
  - rejected: Skip this paper
  - starred: Mark as particularly important
- Comment "approve" or "reject" to trigger automation
Note: Papers with the approved label will be automatically added to the collection.
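The review steps above can be sketched as a small decision function. This is only an illustration, not the actual automation behind this issue: the function name `resolve_decision`, the action strings, and the rule that a trigger comment must agree with an applied label are all assumptions made for the example. Only the label names and the "approve"/"reject" comments come from the instructions above.

```python
from typing import Iterable, Optional

# Trigger comments and labels named in the review instructions.
COMMENT_TO_LABEL = {"approve": "approved", "reject": "rejected"}
# Hypothetical action names; the real automation may call these something else.
LABEL_TO_ACTION = {"approved": "add_to_collection", "rejected": "skip"}


def resolve_decision(labels: Iterable[str], comment: Optional[str]) -> dict:
    """Combine a paper's labels and a review comment into one decision."""
    label_set = {label.strip().lower() for label in labels}
    trigger = COMMENT_TO_LABEL.get((comment or "").strip().lower())

    # Assumption: act only when the trigger comment agrees with an applied label;
    # anything else (no comment, mismatched label) stays pending.
    if trigger is not None and trigger in label_set:
        action = LABEL_TO_ACTION[trigger]
    else:
        action = "pending"

    # The starred label is tracked independently of approve/reject.
    return {"action": action, "starred": "starred" in label_set}


if __name__ == "__main__":
    print(resolve_decision(["approved", "starred"], "approve"))
    print(resolve_decision(["approved"], None))
```

Under these assumptions, a paper labeled approved with no matching comment stays pending, which mirrors the instruction to both add the label and post the comment.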