Tag: ICCV 2025
-
Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering
Imad Eddine Marouf, Enzo Tartaglione, Stephane Lathuiliere, Joost van de Weijer
Continual Learning in Visual Question Answering (VQACL) requires models to acquire new visual-linguistic skills (plasticity) while preserving previously learned knowledge (stability). The inherent multimodality of VQACL exacerbates this challenge, as models must balance stability across visual and textual domains while adapting to novel […]
-
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing
Taihang Hu, Linxuan Li, Kai Wang, Yaxing Wang, Jian Yang, Ming-Ming Cheng
Text-to-image generation has seen groundbreaking advancements with diffusion models, enabling high-fidelity synthesis and precise image editing through cross-attention manipulation. Recently, autoregressive (AR) models have re-emerged as powerful alternatives, leveraging next-token generation to match diffusion models. However, existing editing techniques designed for diffusion models fail […]