Semantic Generative Tuning for Unified Multimodal Models Paper • 2605.18714 • Published 7 days ago • 10
ARIS: Autonomous Research via Adversarial Multi-Agent Collaboration Paper • 2605.03042 • Published 21 days ago • 120
EditHF-1M: A Million-Scale Rich Human Preference Feedback for Image Editing Paper • 2603.14916 • Published Mar 16 • 2
AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model Paper • 2604.19747 • Published Apr 21 • 39
River-LLM: Large Language Model Seamless Exit Based on KV Share Paper • 2604.18396 • Published Apr 20 • 6
Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization Paper • 2604.09574 • Published Feb 24 • 30
Automating Database-Native Function Code Synthesis with LLMs Paper • 2604.06231 • Published Apr 2 • 17
From Passive Observer to Active Critic: Reinforcement Learning Elicits Process Reasoning for Robotic Manipulation Paper • 2603.15600 • Published Mar 16 • 7
EmbTracker: Traceable Black-box Watermarking for Federated Language Models Paper • 2603.12089 • Published Mar 12 • 2
CMIC: Content-Adaptive Mamba for Learned Image Compression Paper • 2508.02192 • Published Aug 4, 2025 • 1
S2CFormer: Revisiting the RD-Latency Trade-off in Transformer-based Learned Image Compression Paper • 2502.00700 • Published Feb 2, 2025
H3D-DGS: Exploring Heterogeneous 3D Motion Representation for Deformable 3D Gaussian Splatting Paper • 2408.13036 • Published Aug 23, 2024
Implicit-explicit Integrated Representations for Multi-view Video Compression Paper • 2311.17350 • Published Nov 29, 2023
AgentConductor: Topology Evolution for Multi-Agent Competition-Level Code Generation Paper • 2602.17100 • Published Feb 19 • 4
ProtegoFed: Backdoor-Free Federated Instruction Tuning with Interspersed Poisoned Data Paper • 2603.00516 • Published Feb 28 • 1
SurfSplat: Conquering Feedforward 2D Gaussian Splatting with Surface Continuity Priors Paper • 2602.02000 • Published Feb 2
Grounding and Enhancing Informativeness and Utility in Dataset Distillation Paper • 2601.21296 • Published Jan 29 • 19
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding Paper • 2602.01785 • Published Feb 2 • 96
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery Paper • 2601.19325 • Published Jan 27 • 81