Continual Multiple Instance Learning with Enhanced Localization for Histopathological Whole Slide Image Analysis

1Department of ECE, 2IPAI, 3INMC, Seoul National University
4Department of Pathology, College of Medicine, Seoul National University
† Corresponding author

Accepted at ICCV 2025

Abstract

Multiple instance learning (MIL) has significantly reduced annotation costs via bag-level weak labels for large-scale images, such as histopathological whole slide images (WSIs). However, its adaptability to continual tasks with minimal forgetting has rarely been explored, especially for instance classification for localization. Weakly incremental learning for semantic segmentation has been studied for continual localization, but it focused on natural images, leveraging global relationships among hundreds of small patches (e.g., $16 \times 16$) using pre-trained models. This approach seems infeasible for MIL localization due to the enormous number ($\sim 10^5$) of large patches (e.g., $256 \times 256$) and the absence of usable global relationships, such as those among cancer cells. To address these challenges, we propose Continual Multiple Instance Learning with Enhanced Localization (CoMEL), an MIL framework for both localization and adaptability with minimal forgetting. CoMEL consists of (1) Grouped Double Attention Transformer (GDAT) for efficient instance encoding, (2) Bag Prototypes-based Pseudo-Labeling (BPPL) for reliable instance pseudo-labeling, and (3) Orthogonal Weighted Low-Rank Adaptation (OWLoRA) to mitigate forgetting in both bag and instance classification. Extensive experiments on three public WSI datasets demonstrate the superior performance of CoMEL, outperforming prior art by up to $11.00\%$ in bag-level accuracy and up to $23.4\%$ in localization accuracy under the continual MIL setup.

Illustration of Continual Multiple Instance Learning with Enhanced Localization (CoMEL)


In this work, we propose Continual Multiple Instance Learning with Enhanced Localization (CoMEL). Compared to the baseline (ConSlide), it alleviates forgetting of localization under multiple instance learning while effectively learning new tasks.

Overview of Method Components in CoMEL


Overview of Continual Multiple Instance Learning with Enhanced Localization (CoMEL). It aims to alleviate the forgetting of both bag and instance classification on previous tasks under the MIL setup, while effectively learning new tasks. It consists of three key components: Grouped Double Attention Transformer (GDAT), Bag Prototypes-based Pseudo-Labeling (BPPL), and Orthogonal Weighted Low-Rank Adaptation (OWLoRA).

Grouped Double Attention Transformer (GDAT) and Bag Prototypes-based Pseudo-Labeling (BPPL)


Components of (a) Grouped Double Attention Transformer (GDAT) and (b) Bag Prototypes-based Pseudo-Labeling (BPPL) for enhanced localization. GDAT applies two sequential efficient attention operations over a small number of grouped tokens, obtained by averaging the instances in the same region into a single token. BPPL normalizes the attention scores into predicted class probabilities, builds both positive and negative prototypes for pseudo-labeling, and then filters the pseudo-labels based on bag and instance classification performance.
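The two components above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function names, the single-head dot-product attention, and the thresholds `tau_pos`/`tau_neg` are our assumptions for exposition.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def grouped_double_attention(X, region_ids):
    """GDAT sketch: X is (N, D) instance features, region_ids is (N,).
    Instances in the same region are averaged into one grouped token, and
    two sequential attentions route information through the R << N tokens."""
    N, D = X.shape
    regions = np.unique(region_ids)
    # Average instances in the same region into one grouped token: (R, D).
    G = np.stack([X[region_ids == r].mean(axis=0) for r in regions])
    # First attention: grouped tokens gather from all instances (R x N).
    G = softmax(G @ X.T / np.sqrt(D), axis=-1) @ X
    # Second attention: instances read back from grouped tokens (N x R).
    return softmax(X @ G.T / np.sqrt(D), axis=-1) @ G  # refined (N, D)

def bppl_pseudo_labels(H, attn, tau_pos=0.7, tau_neg=0.3):
    """BPPL sketch: normalize attention scores into [0, 1] probabilities,
    build positive/negative prototypes from confident instances, then
    pseudo-label every instance by prototype similarity."""
    p = (attn - attn.min()) / (attn.max() - attn.min() + 1e-8)
    pos, neg = p >= tau_pos, p <= tau_neg
    if not pos.any() or not neg.any():
        return np.full(len(H), -1)  # filtered out: no reliable prototypes
    proto_pos, proto_neg = H[pos].mean(axis=0), H[neg].mean(axis=0)
    sim = lambda c: H @ c / (np.linalg.norm(H, axis=-1) * np.linalg.norm(c) + 1e-8)
    return (sim(proto_pos) > sim(proto_neg)).astype(int)
```

In the actual method the pseudo-labels are additionally filtered by bag- and instance-classification performance before being used as localization supervision; the sketch only shows the prototype-matching step.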

Orthogonal Weighted Low-Rank Adaptation (OWLoRA)


Illustration of Orthogonal Weighted Low-Rank Adaptation (OWLoRA) for continual MIL. For the first task, OWLoRA trains full-rank weights and then extracts the principal orthogonal components via singular value decomposition. For subsequent tasks, it imposes intra- and inter-orthogonality on the bases of the learnable low-rank matrices for the current task.
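The two steps above can be sketched with NumPy. This is a simplified illustration under our own assumptions (function names, Frobenius-norm penalties, and the choice of penalizing the right factor's rows are illustrative), not the paper's training code.

```python
import numpy as np

def task1_low_rank_factors(W, rank):
    """Task 1 sketch: after full-rank training of W, extract the principal
    orthogonal components via SVD and keep rank-r factors B (scaled left
    singular vectors) and A (right singular vectors, orthonormal rows)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    B = U[:, :rank] * S[:rank]  # (d_out, r), singular values folded in
    A = Vt[:rank]               # (r, d_in), orthonormal basis rows
    return B, A

def orthogonality_penalty(A_new, A_prev_list):
    """Subsequent-task sketch: intra-orthogonality keeps the new basis rows
    orthonormal; inter-orthogonality keeps them orthogonal to the bases
    retained from all previous tasks."""
    r = A_new.shape[0]
    intra = np.linalg.norm(A_new @ A_new.T - np.eye(r)) ** 2
    inter = sum(np.linalg.norm(A_new @ A_prev.T) ** 2 for A_prev in A_prev_list)
    return intra + inter
```

Because each task's update lives in a subspace orthogonal to earlier tasks' bases, later updates minimally perturb the directions that earlier tasks rely on, which is what mitigates forgetting in both bag and instance classification.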

Quantitative Results on Continual Instance Classification in MIL


Quantitative results of CL methods on instance classification in the continual MIL setup. The best and second-best results are marked in bold and underlined, respectively. Each experiment consisted of 10 runs. We conducted the experiments on five sequential organ datasets from combined CM-16 and PAIP. For the baselines, we applied the CL approaches on top of our GDAT+BPPL, except for ConSlide. All metrics were measured in percentages. CoMEL achieved the highest performance across all metrics while minimizing forgetting.

Illustration of Continual Instance Classification over Tasks


Instance-level classification accuracy over sequentially learned organs from combined CM-16 and PAIP. Each figure shows the forgetting for a specific organ under the continual MIL setup. The task index indicates the order of organs in the sequential tasks. CoMEL consistently achieved superior performance across all tasks, effectively mitigating catastrophic forgetting.

Qualitative Results on Continual Instance Classification in MIL


Qualitative results of localization across sequential organ datasets under the continual MIL setup. Each column shows localization performance on Task 1 (left) or Task 2 (right) as the learned organ changes over sequential tasks. Each row corresponds to a CL method, including CoMEL. CoMEL successfully preserved localization quality across all tasks, while the baselines suffered from increasing false positives or false negatives.

Additional Qualitative Results - 1


Additional qualitative results of localization across sequential organ datasets under the continual MIL setup. Each column shows localization performance on Task 1 as the learned organ changes over sequential tasks. Each row corresponds to a CL method, including CoMEL. CoMEL successfully preserved localization quality across all tasks, while the baselines suffered from increasing false positives or false negatives.

Additional Qualitative Results - 2


Additional qualitative results of localization across sequential organ datasets under the continual MIL setup. Each column shows localization performance on Task 2 as the learned organ changes over sequential tasks. Each row corresponds to a CL method, including CoMEL. CoMEL successfully preserved localization quality across all tasks, while the baselines suffered from increasing false positives or false negatives.

BibTeX


      @inproceedings{lee2025comel,
          author    = {Lee, Byung Hyun and Jeong, Wongi and Han, Woojae and Lee, Kyoungbun and Chun, Se Young},
          title     = {Continual Multiple Instance Learning with Enhanced Localization for Histopathological Whole Slide Image Analysis},
          booktitle = {ICCV},
          year      = {2025},
      }