Category: Publications

  • From Cradle to Cane: A Two-Pass Framework for High-Fidelity Lifespan Face Aging

    Tao Liu, Dafeng Zhang, Gengchen Li, Shizhuo Liu, Yongqi Song, Senmao Li, Shiqi Yang, Boqian Li, Kai Wang, Yaxing Wang. Face aging has become a crucial task in computer vision, with applications ranging from entertainment to healthcare. However, existing methods struggle with achieving a realistic and seamless transformation across the entire lifespan, especially when handling large age gaps or extreme […] Read Full Paper →

  • Accurate and Efficient Low-Rank Model Merging in Core Space

    Aniello Panariello, Daniel Marczak, Simone Magistri, Angelo Porrello, Bartłomiej Twardowski, Andrew D. Bagdanov, Simone Calderara, Joost van de Weijer. In this paper, we address the challenges associated with merging low-rank adaptations of large neural networks. With the rise of parameter-efficient adaptation techniques, such as Low-Rank Adaptation (LoRA), model fine-tuning has become more accessible. While fine-tuning models with LoRA […] Read Full Paper →

  • Free-Lunch Color-Texture Disentanglement for Stylized Image Generation

    Jiang Qin, Senmao Li, Alexandra Gomez-Villa, Shiqi Yang, Yaxing Wang, Kai Wang, Joost van de Weijer. Recent advances in Text-to-Image (T2I) diffusion models have transformed image generation, enabling significant progress in stylized generation using only a few style reference images. However, current diffusion-based methods struggle with fine-grained style customization due to challenges in controlling multiple style attributes, […] Read Full Paper →

  • Covariances for Free: Exploiting Mean Distributions for Federated Learning with Pre-Trained Models

    Dipam Goswami, Simone Magistri, Kai Wang, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer. Using pre-trained models has been found to reduce the effect of data heterogeneity and speed up federated learning algorithms. Recent works have investigated the use of first-order and second-order statistics to aggregate local client data distributions at the server and […] Read Full Paper →

  • Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering

    Imad Eddine Marouf, Enzo Tartaglione, Stephane Lathuiliere, Joost van de Weijer. Continual Learning in Visual Question Answering (VQACL) requires models to acquire new visual-linguistic skills (plasticity) while preserving previously learned knowledge (stability). The inherent multimodality of VQACL exacerbates this challenge, as models must balance stability across visual and textual domains while adapting to novel […] Read Full Paper →

  • Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing

    Taihang Hu, Linxuan Li, Kai Wang, Yaxing Wang, Jian Yang, Ming-Ming Cheng. Text-to-image generation has seen groundbreaking advancements with diffusion models, enabling high-fidelity synthesis and precise image editing through cross-attention manipulation. Recently, autoregressive (AR) models have re-emerged as powerful alternatives, leveraging next-token generation to match diffusion models. However, existing editing techniques designed for diffusion models fail […] Read Full Paper →

  • Improving Continual Learning Performance and Efficiency with Auxiliary Classifiers

    Filip Szatkowski, Yaoyue Zheng, Fei Yang, Bartłomiej Twardowski, Tomasz Trzciński, Joost van de Weijer. Continual learning is crucial for applying machine learning in challenging, dynamic, and often resource-constrained environments. However, catastrophic forgetting – overwriting previously learned knowledge when new information is acquired – remains a major challenge. In this work, we examine the intermediate representations in […] Read Full Paper →

  • No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces

    Daniel Marczak, Simone Magistri, Sebastian Cygert, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer. Model merging integrates the weights of multiple task-specific models into a single multi-task model. Despite recent interest in the problem, a significant performance gap between the combined and single-task models remains. In this paper, we investigate the key characteristics of task […] Read Full Paper →

  • One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models

    Senmao Li, Lei Wang, Kai Wang, Tao Liu, Jiehang Xie, Joost van de Weijer, Fahad Shahbaz Khan, Shiqi Yang, Yaxing Wang, Jian Yang. Text-to-Image (T2I) diffusion models have made remarkable advancements in generative modeling; however, they face a trade-off between inference speed and image quality, posing challenges for efficient deployment. Existing distilled T2I models can generate high-fidelity images with fewer […] Read Full Paper →

  • The Art of Deception: Color Visual Illusions and Diffusion Models

    Alex Gomez-Villa, Kai Wang, Alejandro C. Parraga, Bartłomiej Twardowski, Jesus Malo, Javier Vazquez-Corral, Joost van de Weijer. Visual illusions in humans arise when interpreting out-of-distribution stimuli: if the observer is adapted to certain statistics, perception of outliers deviates from reality. Recent studies have shown that artificial neural networks (ANNs) can also be deceived by visual illusions. This […] Read Full Paper →