Localized Concept Erasure for Text-to-Image Diffusion Models Using Training-Free Gated Low-Rank Adaptation

Abstract

Fine-tuning-based concept erasing has demonstrated promising results in preventing the generation of harmful content by text-to-image diffusion models, removing target concepts while preserving the remaining concepts. To maintain the generation capability of a diffusion model after concept erasure, only the image region containing the target concept should be removed when the concept appears locally in an image, leaving the other regions intact. However, prior methods often compromise the fidelity of the other image regions in order to erase a localized target concept, thereby reducing the overall performance of image generation. To address these limitations, we first introduce a framework called localized concept erasure, which deletes only the specific area of the image containing the target concept while preserving the other regions. As a solution for localized concept erasure, we propose a training-free approach, dubbed Gated Low-rank adaptation for Concept Erasure (GLoCE), that injects a lightweight module into the diffusion model. GLoCE consists of low-rank matrices and a simple gate, both determined from only a few generation steps for the concepts, without training. By applying GLoCE directly to image embeddings and designing the gate to activate only for target concepts, GLoCE can selectively remove only the region of the target concepts, even when target and remaining concepts coexist within an image. Extensive experiments demonstrate that GLoCE not only improves image fidelity to text prompts after erasing localized target concepts, but also outperforms prior methods in efficacy, specificity, and robustness by a large margin, and can be extended to mass concept erasure.

Overview of Gated Low-rank adaptation for Concept Erasure (GLoCE)

Illustration of overall concept erasing results after erasing 50 celebrities with a baseline and with our method. To preserve generation capability after concept erasing, it is essential to maintain high fidelity for remaining concepts even when target concepts appear in the same text prompt. However, baselines often struggle to achieve this fidelity. The proposed method, GLoCE, significantly improves this fidelity while demonstrating strong performance in efficacy, specificity, and robustness, which are key conditions for effective erasure.

Low-rankedness of Features for a Concept in Diffusion Model

To verify the low-rankedness of the features for a concept, we collected embeddings from each layer of the diffusion model (SD v1.4) while generating a few images of the concept. Each forward pass of the model yields thousands of token embeddings, and we repeated this over multiple diffusion timesteps. We then analyzed the spectrum of the stacked embeddings using singular value decomposition (SVD). The figure illustrates this spectrum analysis for embeddings from diverse concepts. Notably, only a small number of singular values are significant across the various concepts.
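The spectrum analysis described above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the authors' code: the helper names (`energy_spectrum`, `effective_rank`), the 95% energy threshold, and the synthetic low-rank matrix standing in for stacked token embeddings are all assumptions made for the example.

```python
import numpy as np

def energy_spectrum(embeddings: np.ndarray) -> np.ndarray:
    """Fraction of total energy carried by each singular value.

    embeddings: (num_tokens, dim) matrix of stacked token embeddings.
    """
    singular_values = np.linalg.svd(embeddings, compute_uv=False)
    energy = singular_values ** 2
    return energy / energy.sum()

def effective_rank(embeddings: np.ndarray, energy_threshold: float = 0.95) -> int:
    """Smallest number of singular values whose energy reaches the threshold."""
    cumulative = np.cumsum(energy_spectrum(embeddings))
    return int(np.searchsorted(cumulative, energy_threshold) + 1)

# Synthetic stand-in for embeddings stacked over tokens and timesteps:
# a rank-4 signal plus small noise (hypothetical sizes).
rng = np.random.default_rng(0)
num_tokens, dim, true_rank = 2000, 64, 4
basis = rng.standard_normal((true_rank, dim))
coeffs = rng.standard_normal((num_tokens, true_rank))
stacked = coeffs @ basis + 0.01 * rng.standard_normal((num_tokens, dim))

spectrum = energy_spectrum(stacked)
rank_95 = effective_rank(stacked)
print("effective rank at 95% energy:", rank_95)
```

On real data, `stacked` would be replaced by the embeddings gathered from one layer; a small effective rank relative to `dim` is what the low-rankedness observation refers to.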

Closed-form LoRA for Concept Erasing via Principal Components of a Target Concept

(a) Extraction of principal components for each layer of a diffusion model from a few generated samples, used to determine the low-rank matrices for erasing target concepts. We construct the mean and primary directions of the distribution of token embeddings for the target and surrogate concepts. (b) Closed-form LoRA for concept erasing. It projects the target embeddings onto the low-rank subspace of the mapping concepts after removing the information of the target concept, inspired by linear guardedness.
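As a rough illustration of steps (a) and (b), the sketch below estimates a mean and principal directions for two sets of embeddings, removes the target directions from an input, and projects the result onto the other set's low-rank subspace. The exact closed-form update is not given on this page, so the formula in `erase_and_map`, the function names, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def mean_and_directions(embeddings: np.ndarray, rank: int):
    """Mean and top-`rank` principal directions of (num_tokens, dim) embeddings."""
    mean = embeddings.mean(axis=0)
    _, _, vt = np.linalg.svd(embeddings - mean, full_matrices=False)
    return mean, vt[:rank].T  # shapes: (dim,), (dim, rank)

def erase_and_map(x, target_mean, target_dirs, map_mean, map_dirs):
    """Assumed update: remove target directions, then project onto the mapping subspace."""
    centered = x - target_mean
    # Remove the components lying along the target's principal directions.
    removed = centered - (centered @ target_dirs) @ target_dirs.T
    # Project what is left onto the low-rank subspace of the mapping concept.
    projected = (removed @ map_dirs) @ map_dirs.T
    return map_mean + projected

# Synthetic rank-3 clusters standing in for target and mapping embeddings.
rng = np.random.default_rng(0)
dim, rank = 32, 3
target = rng.standard_normal((500, rank)) @ rng.standard_normal((rank, dim)) + 5.0
mapping = rng.standard_normal((500, rank)) @ rng.standard_normal((rank, dim)) - 5.0

t_mean, t_dirs = mean_and_directions(target, rank)
m_mean, m_dirs = mean_and_directions(mapping, rank)
mapped = erase_and_map(target, t_mean, t_dirs, m_mean, m_dirs)
```

Because the synthetic target embeddings lie exactly in their rank-3 subspace, removing those directions leaves nothing behind and every mapped embedding lands on the mapping mean. Real embeddings would retain a residual component that is carried into the mapping subspace.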

Inference-Only Update of Gate to Enhance Efficacy and Specificity

Gate mechanism to enhance efficacy for target concepts and specificity for remaining concepts. The parameters of the gate are determined solely from the generation of a few images.
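One way such a gate could be set from a few samples, without gradient-based training, is sketched below. The parameterization is an assumption, not the paper's definition: the score (squared projection onto directions estimated from target embeddings), the midpoint threshold, the `sharpness` value, and all function names are hypothetical choices made for this example.

```python
import numpy as np

def gate_score(x, anchor, dirs):
    """Squared norm of (x - anchor) projected onto the gate directions."""
    return np.sum(((x - anchor) @ dirs) ** 2, axis=-1)

def fit_gate(target, remaining, rank, sharpness=10.0):
    """Set gate parameters from sample statistics only (no training loop)."""
    anchor = remaining.mean(axis=0)
    _, _, vt = np.linalg.svd(target - anchor, full_matrices=False)
    dirs = vt[:rank].T
    t_score = gate_score(target, anchor, dirs).mean()
    r_score = gate_score(remaining, anchor, dirs).mean()
    threshold = 0.5 * (t_score + r_score)              # midpoint between the two groups
    scale = sharpness / max(t_score - r_score, 1e-8)   # steeper when the groups are close
    return anchor, dirs, threshold, scale

def gate(x, anchor, dirs, threshold, scale):
    """Sigmoid gate: near 1 for target-like embeddings, near 0 otherwise."""
    z = scale * (gate_score(x, anchor, dirs) - threshold)
    return 1.0 / (1.0 + np.exp(-z))

def gated_update(x, erased, g):
    """Blend original and erased embeddings per token according to the gate."""
    return x + g[..., None] * (erased - x)

# Synthetic embeddings: target tokens are shifted along one axis.
rng = np.random.default_rng(0)
dim = 16
shift = np.zeros(dim)
shift[0] = 3.0
target = 0.1 * rng.standard_normal((200, dim)) + shift
remaining = 0.1 * rng.standard_normal((200, dim))

params = fit_gate(target, remaining, rank=1)
g_target = gate(target, *params)
g_remaining = gate(remaining, *params)
kept = gated_update(remaining, np.zeros_like(remaining), g_remaining)
```

In this toy setting the gate opens for the shifted target tokens and stays nearly closed for the remaining ones, so `kept` is almost identical to `remaining`. This is the behavior that lets an erasing update act only where the target concept is present.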

Qualitative Results on Localized Celebs Erasure

Qualitative results of the baselines and our method on localized celebrity erasure, used to evaluate the fidelity of the generated images. With the baselines, erasing only one target concept can degrade the fidelity of an image containing both a target and a remaining celebrity, while GLoCE effectively erases the region containing features of the target concept and successfully preserves the other regions.

Quantitative Results on Localized Celebs Erasure

Comparison of the baselines and the proposed method for image fidelity on text prompts containing both target and remaining celebrities. We measured accuracy in percentage and the harmonic mean of efficacy and specificity, following MACE.
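For reference, a harmonic mean of efficacy and specificity can be computed as in the sketch below. It assumes, in the style of MACE, that efficacy is one minus the detection accuracy on target concepts and that specificity is the detection accuracy on remaining concepts, with both given as fractions rather than percentages. The function names and the input values are illustrative and are not results from the paper.

```python
def harmonic_mean(values):
    """Harmonic mean of positive numbers; returns 0.0 if any value is non-positive."""
    if any(v <= 0 for v in values):
        return 0.0
    return len(values) / sum(1.0 / v for v in values)

def erasure_harmonic_mean(acc_target: float, acc_remaining: float) -> float:
    """Combine efficacy and specificity into one score (assumed definitions)."""
    efficacy = 1.0 - acc_target    # lower accuracy on erased targets is better
    specificity = acc_remaining    # higher accuracy on remaining concepts is better
    return harmonic_mean([efficacy, specificity])

# Illustrative inputs only.
score = erasure_harmonic_mean(acc_target=0.02, acc_remaining=0.90)
print(f"harmonic mean: {score:.4f}")
```

The harmonic mean penalizes imbalance: a method that erases perfectly but destroys the remaining concepts, or the reverse, scores close to zero.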

Qualitative Results on Celebs Erasure for Prompts Only Containing One Concept

Qualitative comparison on celebrity erasure when prompts contain either target or remaining concepts. To further verify the efficacy and specificity of the proposed method, we evaluated erasure and preservation performance across various domains using prompts that contain only target or only remaining concepts.

Quantitative Results on Celebs Erasure for Prompts Only Containing One Concept

Quantitative results on celebrities erasure. We used CS and GCD accuracy in percentage (ACC_t for target and ACC_r for remaining concepts). We also measured FID for COCO-30K, or KID (scaled by 100) for the other remaining concepts.

Results on Artistic Styles Erasure

Qualitative results showing efficacy for the target style and fidelity of the remaining celebrity as the number of erased targets increases.

Results on Explicit Contents Erasure

Number of explicit contents detected by the NudeNet detector on I2P, and preservation performance on MS-COCO 30K measured with CS and FID. GLoCE outperforms the baselines in efficacy on explicit contents by a large margin while achieving the best specificity on COCO-30K.

Illustration of Gate Activation Map

Qualitative illustration of the gate activation map for target concepts. The gate is activated precisely on the spatially local region of the target concepts across multiple layers of the diffusion model and across DDIM time steps. Through this local activation of the gate, GLoCE can successfully erase the local region of the target concepts.

BibTeX


      @InProceedings{lee2025gloce,
          author    = {Lee, Byung Hyun and Lim, Sungjin and Chun, Se Young},
          title     = {Localized Concept Erasure for Text-to-Image Diffusion Models Using Training-Free Gated Low-Rank Adaptation},
          booktitle = {CVPR},
          year      = {2025},
      }