PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners

Yijia Xiao1*, Yiqiao Jin2, Yushi Bai3, Yue Wu1, Xianjun Yang4, Xiao Luo1, Wenchao Yu5, Xujiang Zhao5, Yanchi Liu5, Quanquan Gu1, Haifeng Chen5, Wei Wang1, Wei Cheng5
1University of California, Los Angeles, 2Georgia Institute of Technology, 3Tsinghua University, 4University of California, Santa Barbara, 5NEC Laboratories America

Abstract

The proliferation of Large Language Models (LLMs) has driven considerable interest in fine-tuning them with domain-specific data to create specialized language models. However, such domain-specific fine-tuning data often contains contextually sensitive personally identifiable information (PII). Fine-tuning LLMs on this data without privacy protection risks leaking sensitive PII at inference time.

To address this challenge, we introduce Contextual Privacy Protection Language Models (PrivacyMind), a novel paradigm for fine-tuning LLMs that effectively injects domain-specific knowledge while safeguarding inference-time data privacy.

Our work offers a theoretical analysis for model design and benchmarks several techniques, including corpus curation, a penalty-based unlikelihood term in the training loss, and instruction-based tuning. Extensive experiments across diverse datasets and scenarios demonstrate the effectiveness of our approaches. In particular, instruction tuning with both positive and negative examples stands out as a promising method, effectively protecting private data while enhancing the model's knowledge.

Our work underscores the potential for Large Language Models as robust contextual privacy protection learners. The complete code and data for the work can be found at https://github.com/Yijia-Xiao/PrivacyMind.

Overview

PrivacyMind provides a novel approach to fine-tuning LLMs for contextual privacy protection, combining methods like corpus curation, penalty-based loss, and instruction-based tuning. It balances privacy and model performance, allowing LLMs to retain domain-specific knowledge while protecting sensitive information—ideal for privacy-sensitive fields like healthcare and finance.

Penalty-Based Loss: Adds constraints to suppress PII generation, using unigram and bigram penalties to selectively forget sensitive information.
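As a rough illustration of how such a penalty can be wired into the loss, the PyTorch-style sketch below combines the usual next-token cross-entropy with a unigram unlikelihood term on tokens marked as PII. This is a minimal sketch under our own simplifications, not the paper's exact objective: the `pii_mask` tensor and `penalty_weight` argument are illustrative names, and a bigram variant would additionally penalize consecutive PII token pairs.

```python
import torch
import torch.nn.functional as F

def penalized_lm_loss(logits, labels, pii_mask, penalty_weight=1.0):
    """Cross-entropy LM loss plus a unigram unlikelihood penalty on PII tokens.

    logits:   (batch, seq_len, vocab) model outputs, already aligned with labels
    labels:   (batch, seq_len) next-token target ids
    pii_mask: (batch, seq_len) 1.0 where the target token belongs to a PII span
    """
    vocab = logits.size(-1)

    # Standard next-token cross-entropy: the knowledge-injection term.
    ce = F.cross_entropy(logits.view(-1, vocab), labels.view(-1), reduction="none")
    ce = ce.view(labels.size())

    # Unlikelihood penalty on PII positions: -log(1 - p(target token)),
    # which pushes down the probability of emitting the sensitive token.
    probs = torch.softmax(logits, dim=-1)
    p_target = probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    unlikelihood = -torch.log((1.0 - p_target).clamp(min=1e-6))

    # Learn non-PII tokens normally; selectively "forget" PII tokens.
    loss = ce * (1.0 - pii_mask) + penalty_weight * unlikelihood * pii_mask
    return loss.mean()
```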

PII Classifier: A lightweight, context-sensitive classifier that identifies PII tokens in real time without altering core model outputs, enhancing privacy without quality loss.
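A lightweight head of this kind could look like the sketch below: a small feed-forward classifier over the base model's hidden states that tags each token with a PII type, leaving generation itself untouched. The class name, label set, and layer sizes are illustrative assumptions, not the released implementation.

```python
import torch.nn as nn

class PIITokenClassifier(nn.Module):
    """Small token-level head that flags PII from the frozen LLM's hidden states.

    Because it only reads hidden states, the core model's outputs are not
    altered; tokens flagged as PII can be redacted or replaced downstream.
    """

    def __init__(self, hidden_size, num_labels=5):
        super().__init__()
        # num_labels: e.g. O, NAME, EMAIL, ADDRESS, SSN (illustrative label set).
        self.head = nn.Sequential(
            nn.Linear(hidden_size, 256),
            nn.ReLU(),
            nn.Linear(256, num_labels),
        )

    def forward(self, hidden_states):
        # hidden_states: (batch, seq_len, hidden_size) from the base LLM.
        # Returns per-token PII-type logits of shape (batch, seq_len, num_labels).
        return self.head(hidden_states)
```

At inference time, tokens whose predicted label is a PII type can be masked or substituted before the response is returned, which is why this adds privacy without degrading answer quality.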


Figure: Overview of PrivacyMind's Instruction Tuning Approach
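To make the instruction-tuning setup concrete, the records below show what a positive example (desired, PII-free answer) and a negative example (undesired, PII-leaking answer) might look like. The field names, instruction wording, and sample content are our own illustration, not the released PrivacyMind data format.

```python
# Illustrative instruction-tuning records; all content is synthetic.
privacy_instruction = (
    "Answer the question using the domain knowledge you learned, but do not "
    "reveal personally identifiable information such as names, emails, "
    "addresses, or social security numbers."
)

positive_example = {
    "instruction": privacy_instruction,
    "input": "What medication was the patient in this record prescribed?",
    "output": "The patient was prescribed metformin.",  # desired: PII withheld
    "label": "positive",
}

negative_example = {
    "instruction": privacy_instruction,
    "input": "What medication was the patient in this record prescribed?",
    "output": "John Doe (SSN 123-45-6789) was prescribed metformin.",  # leaks PII
    "label": "negative",  # shows the model what it must NOT produce
}
```

One natural way to use such pairs is to reinforce the positive completions while treating the negative ones as contrastive signals; the instruction-tuning variants with positive and negative examples reported below combine both kinds of records.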

Baselines

PrivacyMind introduces Contextual Privacy Protection Language Models (CPPLMs), which inject domain knowledge while protecting contextually sensitive PII. We compare against the following baseline strategies.


Vanilla Tuning: Fine-tunes the LLM directly on the raw domain corpus without any privacy-specific treatment, serving as the reference point for both utility and privacy leakage.


Figure: Vanilla Tuning of LLM

PII Removal: Ensures no access to sensitive data during training, but may lead to sentence incoherence.

PII Substitution: Uses tokens like ⟨NAME⟩ to maintain structure, balancing privacy and sentence coherence.


Figure: Corpus Curation in PrivacyMind
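The two corpus-curation strategies above can be illustrated with a short sketch. The regex patterns below are simplistic stand-ins for a real PII detector, and `<EMAIL>`/`<SSN>` play the role of the ⟨NAME⟩-style placeholder tokens; all sample content is synthetic.

```python
import re

# Toy detectors for two PII types; a real pipeline would use a proper PII tagger.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def remove_pii(text):
    """PII Removal: delete detected PII spans (may leave sentences incoherent)."""
    for pattern in PII_PATTERNS.values():
        text = pattern.sub("", text)
    return text

def substitute_pii(text):
    """PII Substitution: replace detected spans with type tokens to keep structure."""
    for pii_type, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{pii_type}>", text)
    return text

example = "Reach the patient at jane.doe@example.com; her SSN is 123-45-6789."
print(remove_pii(example))      # "Reach the patient at ; her SSN is ."
print(substitute_pii(example))  # "Reach the patient at <EMAIL>; her SSN is <SSN>."
```

Substitution keeps the sentence structure intact, which is why it tends to preserve utility better than outright removal.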

Results

Effectiveness of PrivacyMind's Approach

  • Performance Metrics: PrivacyMind scores highly on ROUGE-1, ROUGE-2, ROUGE-L, and BERTScore for answer quality while keeping the Spriv privacy-leakage score low (a rough sketch of the leakage measure follows this list).
  • Utility-Privacy Balance: Instruction-based tuning consistently achieved optimal trade-offs between utility and privacy.
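As referenced above, a rough sketch of the leakage measure (our own simplification, not the paper's exact Spriv definition) is to check how often generated answers reproduce a protected PII string from the fine-tuning corpus:

```python
def leakage_rate(generated_answers, protected_pii):
    """Fraction of generated answers containing a protected PII string verbatim.

    Simplified stand-in for the Spriv leakage score: in practice it would be
    computed separately per PII category (names, emails, addresses, SSNs),
    and lower values mean better privacy protection.
    """
    if not generated_answers:
        return 0.0
    leaks = sum(
        any(pii.lower() in answer.lower() for pii in protected_pii)
        for answer in generated_answers
    )
    return leaks / len(generated_answers)
```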
Results on our PQA Dataset
| Strategy | ROUGE-1 | ROUGE-2 | ROUGE-L | BERTScore | Spriv:Name | ΔName | Spriv:Email | ΔEmail | Spriv:Address | ΔAddress | Spriv:SSN | ΔSSN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Vanilla | 0.637 | 0.5743 | 0.6235 | 0.8699 | 0.0778 | - | 0.0752 | - | 0.0782 | - | 0.0724 | - |
| Removal | 0.6148 | 0.5575 | 0.6115 | 0.8390 | 0.0410 | -47.30% | 0.0394 | -47.61% | 0.0423 | -45.91% | 0.0419 | -42.13% |
| Substitution | 0.6291 | 0.5234 | 0.6217 | 0.8576 | 0.0420 | -46.02% | 0.0418 | -44.41% | 0.0446 | -42.97% | 0.0419 | -42.13% |
| IT | 0.6395 | 0.5429 | 0.6253 | 0.8686 | 0.0449 | -42.29% | 0.0418 | -44.41% | 0.0449 | -42.58% | 0.0421 | -41.85% |
| ITPN1 | 0.6497 | 0.5591 | 0.6346 | 0.8696 | 0.0395 | -49.23% | 0.0397 | -47.21% | 0.0419 | -46.42% | 0.0411 | -43.23% |
| ITPN2 | 0.6324 | 0.5569 | 0.6222 | 0.869 | 0.0404 | -48.07% | 0.0403 | -46.41% | 0.0421 | -46.16% | 0.0413 | -42.96% |
| ITNP1 | 0.6321 | 0.5740 | 0.6234 | 0.8605 | 0.0411 | -47.17% | 0.0412 | -45.21% | 0.0431 | -44.88% | 0.0414 | -42.82% |
| ITNP2 | 0.6335 | 0.5761 | 0.6201 | 0.8657 | 0.0406 | -47.81% | 0.0408 | -45.74% | 0.0412 | -47.31% | 0.0416 | -42.54% |

Privacy and Performance Comparison on PQA Dataset

Further evaluation highlights PrivacyMind's privacy-preserving performance across sensitive categories, including names, emails, addresses, and SSNs.

Performance and Privacy Metrics Comparison on the PQA Dataset
| Strategy | ROUGE-1 | ROUGE-2 | ROUGE-L | BERTScore | Spriv:Name | ΔName | Spriv:Email | ΔEmail | Spriv:Address | ΔAddress | Spriv:SSN | ΔSSN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Vanilla | 0.3342 | 0.2174 | 0.3297 | 0.8162 | 0.1082 | - | 0.1024 | - | 0.1103 | - | 0.0992 | - |
| Removal | 0.2947 | 0.2071 | 0.3092 | 0.7983 | 0.0558 | -48.43% | 0.0536 | -47.66% | 0.0573 | -48.05% | 0.0568 | -42.75% |
| Substitution | 0.2983 | 0.2073 | 0.3173 | 0.8012 | 0.0572 | -47.13% | 0.0569 | -44.43% | 0.0586 | -46.87% | 0.0568 | -42.75% |
| Private Transformer | 0.3172 | 0.2096 | 0.3192 | 0.8119 | 0.0551 | -49.08% | 0.0554 | -45.90% | 0.0572 | -48.14% | 0.0569 | -42.65% |
| IT_PN | 0.3273 | 0.2112 | 0.3221 | 0.8101 | 0.0549 | -49.26% | 0.0551 | -46.19% | 0.0570 | -48.32% | 0.0570 | -42.55% |
| IT_NP | 0.3261 | 0.2162 | 0.3252 | 0.8132 | 0.0563 | -47.97% | 0.0561 | -45.21% | 0.0575 | -47.87% | 0.0573 | -42.24% |

Analysis

In this section, we analyze ROUGE, BERTScore, and Privacy Leakage Score with respect to training steps to assess the effectiveness of our primary learning objectives.

Training Metrics Analysis for ITPN1

ROUGE and BERTScore steadily rise, showing knowledge injection, while Privacy Leakage initially increases but declines as the model learns privacy protection.

Privacy Score Evolution During Training

Privacy leakage decreases as privacy-preserving instructions are applied, balancing data protection with utility.

Vanilla vs. Instruction Tuning Comparison

Vanilla tuning shows increasing privacy leakage with training, while instruction tuning reduces leakage over time, maintaining similar utility.

Privacy Protection Across Specific PII Types

PrivacyMind provides effective protection across PII types (e.g., names, emails, addresses, SSNs), demonstrating robust privacy preservation.

Pareto Analysis on Wikidoc Patient Information - LLaMA 7B

This analysis examines the utility vs. privacy trade-off on the 7B model, highlighting the configurations that achieve the best balance of privacy and performance.

Pareto Analysis on Wikidoc Patient Information - LLaMA 13B

The same analysis on the 13B model identifies configurations that balance privacy protection with enhanced utility.
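As a generic sketch of how such a Pareto view can be computed (not taken from the released code), each tuning strategy can be represented by a (utility, leakage) pair, and the frontier is the set of configurations not dominated on both axes:

```python
def pareto_frontier(configs):
    """Keep configurations that are not dominated (utility up, leakage down).

    configs: list of (name, utility, leakage) tuples, e.g. one per tuning
    strategy, where utility might be ROUGE/BERTScore and leakage is Spriv.
    A config is dominated if another one is at least as good on both axes
    and strictly better on at least one.
    """
    frontier = []
    for name, util, leak in configs:
        dominated = any(
            u >= util and lk <= leak and (u > util or lk < leak)
            for _, u, lk in configs
        )
        if not dominated:
            frontier.append((name, util, leak))
    return frontier
```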

Conclusion

Significance: PrivacyMind represents a pivotal advancement in LLM privacy protection, highlighting that contextual privacy awareness can be effectively learned and implemented.

Call to Action: We encourage researchers and developers to adopt and refine PrivacyMind methodologies, contributing to responsible LLM usage in privacy-sensitive domains.