The proliferation of Large Language Models (LLMs) has driven considerable interest in fine-tuning them with domain-specific data to create specialized language models. However, such domain-specific fine-tuning data often contains contextually sensitive personally identifiable information (PII). Fine-tuning LLMs directly on this data, without privacy protection, risks leaking sensitive PII at inference time.
To address this challenge, we introduce Contextual Privacy Protection Language Models (PrivacyMind), a novel paradigm for fine-tuning LLMs that effectively injects domain-specific knowledge while safeguarding inference-time data privacy.
Our work offers a theoretical analysis for model design and benchmarks several techniques: corpus curation, a penalty-based unlikelihood term in the training loss, and instruction-based tuning. Extensive experiments across diverse datasets and scenarios demonstrate the effectiveness of our approaches. In particular, instruction tuning with both positive and negative examples stands out as a promising method, effectively protecting private data while enhancing the model's knowledge.
Our work underscores the potential for Large Language Models as robust contextual privacy protection learners. The complete code and data for the work can be found at https://github.com/Yijia-Xiao/PrivacyMind.
PrivacyMind provides a novel approach to fine-tuning LLMs for contextual privacy protection, combining corpus curation, a penalty-based loss, and instruction-based tuning. It balances privacy with model performance, allowing LLMs to retain domain-specific knowledge while protecting sensitive information, which makes it well suited to privacy-sensitive fields such as healthcare and finance.
Penalty-Based Loss: Adds an unlikelihood penalty to the training objective that suppresses PII generation, using unigram and bigram penalties to selectively forget sensitive information (a minimal sketch follows below).
PII Classifier: A lightweight, context-sensitive classifier that identifies PII tokens in real time without altering the core model's outputs, improving privacy with no loss in generation quality (see the second sketch below).
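Both components lend themselves to short illustrations. First, a minimal sketch of the penalty-based (unlikelihood) objective, assuming a PyTorch training loop and a precomputed per-token PII mask; the function name, the mask, and the weighting are our assumptions, and only the unigram case is shown (a bigram variant would additionally penalize pairs of consecutive PII tokens).

```python
import torch
import torch.nn.functional as F

def penalty_based_loss(logits, labels, pii_mask, alpha=1.0):
    """Cross-entropy on non-PII tokens plus an unlikelihood penalty on PII tokens.

    logits:   (batch, seq_len, vocab) next-token scores from the language model
    labels:   (batch, seq_len) target token ids
    pii_mask: (batch, seq_len) float mask, 1.0 where the target token lies in a PII span
    alpha:    weight of the unigram penalty term (assumed hyperparameter)
    """
    vocab = logits.size(-1)
    ce = F.cross_entropy(logits.reshape(-1, vocab), labels.reshape(-1), reduction="none")
    ce = ce.view(labels.shape)

    # Unlikelihood term: minimize -log(1 - p(target)) at PII positions, pushing
    # probability mass away from tokens inside PII spans.
    log_probs = F.log_softmax(logits, dim=-1)
    p_target = log_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1).exp()
    unlikelihood = -torch.log((1.0 - p_target).clamp_min(1e-6))

    loss = (1.0 - pii_mask) * ce + alpha * pii_mask * unlikelihood
    return loss.mean()
```

Second, a sketch of the PII classifier idea: an off-the-shelf token-classification pipeline (the checkpoint below is a public NER model used only as a stand-in for a dedicated PII detector) scans generated text and tags or redacts PII spans, leaving the generator itself untouched.

```python
from transformers import pipeline

# Stand-in detector: any lightweight token-classification model tagged for
# PII or entities can play this role; dslim/bert-base-NER is a public example.
pii_detector = pipeline("token-classification",
                        model="dslim/bert-base-NER",
                        aggregation_strategy="simple")

def redact_pii(generated_text):
    """Replace detected PII spans in generated text with their entity tags."""
    spans = sorted(pii_detector(generated_text), key=lambda s: s["start"], reverse=True)
    for s in spans:
        generated_text = (generated_text[: s["start"]]
                          + f"<{s['entity_group']}>"
                          + generated_text[s["end"]:])
    return generated_text

print(redact_pii("Patient John Smith was admitted to St. Mary Hospital."))
# e.g. "Patient <PER> was admitted to <ORG>." (exact tags depend on the detector)
```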
Figure: Overview of PrivacyMind's Instruction Tuning Approach
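The figure above outlines the instruction-tuning approach. As a concrete but hypothetical illustration, an IT_PN-style training sample could pair a protective instruction with a positive (PII-substituted) demonstration and a negative (PII-leaking) demonstration; the field names, wording, and record content below are our assumptions, not PrivacyMind's released template.

```python
# Hypothetical IT_PN-style sample: instruction, context, then a positive
# demonstration to imitate and a negative demonstration to discourage.
example = {
    "instruction": (
        "Answer the question using the patient record, but never reveal "
        "personally identifiable information such as names, emails, "
        "addresses, or SSNs."
    ),
    "input": ("Record: <NAME> reported chest pain on 2021-03-04. "
              "Question: What symptom was reported?"),
    "positive_output": "The patient <NAME> reported chest pain.",     # desired
    "negative_output": "The patient John Smith reported chest pain.",  # to be discouraged
}
```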
PrivacyMind introduces Contextual Privacy Protection Language Models (CPPLM), which inject domain knowledge during fine-tuning while protecting contextually sensitive PII.
Vanilla Tuning: Fine-tunes the LLM directly on the raw domain corpus with no privacy-specific safeguards; it serves as the baseline, and the Δ columns in the tables below are computed relative to it.
Figure: Vanilla Tuning of LLM
PII Removal: Deletes detected PII from the training corpus, so the model never sees sensitive data during training, but the resulting sentences can become incoherent.
PII Substitution: Replaces PII with placeholder tokens such as ⟨NAME⟩, preserving sentence structure and balancing privacy with coherence (a minimal curation sketch follows the figure below).
Figure: Corpus Curation in PrivacyMind
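A minimal sketch of the two curation strategies, using toy regex patterns as a stand-in for a real PII recognizer (the actual detector, patterns, and placeholder format used by PrivacyMind may differ):

```python
import re

# Toy PII patterns; a production pipeline would use a proper PII recognizer.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def remove_pii(text):
    """PII Removal: delete detected spans (may leave incoherent sentences)."""
    for pattern in PII_PATTERNS.values():
        text = pattern.sub("", text)
    return text

def substitute_pii(text):
    """PII Substitution: replace detected spans with type placeholders."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

sample = "Contact jane.doe@clinic.org, SSN 123-45-6789, for follow-up."
print(remove_pii(sample))      # "Contact , SSN , for follow-up."
print(substitute_pii(sample))  # "Contact <EMAIL>, SSN <SSN>, for follow-up."
```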
The table below reports utility (ROUGE, BERTScore; higher is better) and privacy leakage (S_priv; lower is better) for each fine-tuning strategy; the Δ columns give the relative change in leakage with respect to Vanilla tuning.

Strategy | ROUGE-1 | ROUGE-2 | ROUGE-L | BERTScore | S_priv (Name) | Δ (Name) | S_priv (Email) | Δ (Email) | S_priv (Address) | Δ (Address) | S_priv (SSN) | Δ (SSN) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Vanilla | 0.6370 | 0.5743 | 0.6235 | 0.8699 | 0.0778 | - | 0.0752 | - | 0.0782 | - | 0.0724 | - |
Removal | 0.6148 | 0.5575 | 0.6115 | 0.8390 | 0.0410 | -47.30% | 0.0394 | -47.61% | 0.0423 | -45.91% | 0.0419 | -42.13% |
Substitution | 0.6291 | 0.5234 | 0.6217 | 0.8576 | 0.0420 | -46.02% | 0.0418 | -44.41% | 0.0446 | -42.97% | 0.0419 | -42.13% |
IT | 0.6395 | 0.5429 | 0.6253 | 0.8686 | 0.0449 | -42.29% | 0.0418 | -44.41% | 0.0449 | -42.58% | 0.0421 | -41.85% |
IT_PN1 | 0.6497 | 0.5591 | 0.6346 | 0.8696 | 0.0395 | -49.23% | 0.0397 | -47.21% | 0.0419 | -46.42% | 0.0411 | -43.23% |
IT_PN2 | 0.6324 | 0.5569 | 0.6222 | 0.8690 | 0.0404 | -48.07% | 0.0403 | -46.41% | 0.0421 | -46.16% | 0.0413 | -42.96% |
IT_NP1 | 0.6321 | 0.5740 | 0.6234 | 0.8605 | 0.0411 | -47.17% | 0.0412 | -45.21% | 0.0431 | -44.88% | 0.0414 | -42.82% |
IT_NP2 | 0.6335 | 0.5761 | 0.6201 | 0.8657 | 0.0406 | -47.81% | 0.0408 | -45.74% | 0.0412 | -47.31% | 0.0416 | -42.54% |
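Reading the Δ columns: each entry is the relative change in S_priv with respect to the Vanilla row. A quick check against the Removal row:

```python
s_vanilla, s_removal = 0.0778, 0.0410                # S_priv (Name) for Vanilla and Removal
print(f"{(s_removal - s_vanilla) / s_vanilla:.2%}")  # -47.30%, matching Δ (Name) above
```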
Further evaluation highlights PrivacyMind's privacy-preserving performance across sensitive categories, including names, emails, addresses, and SSNs.
Strategy | ROUGE-1 | ROUGE-2 | ROUGE-L | BERTScore | S_priv (Name) | Δ (Name) | S_priv (Email) | Δ (Email) | S_priv (Address) | Δ (Address) | S_priv (SSN) | Δ (SSN) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Vanilla | 0.3342 | 0.2174 | 0.3297 | 0.8162 | 0.1082 | - | 0.1024 | - | 0.1103 | - | 0.0992 | - |
Removal | 0.2947 | 0.2071 | 0.3092 | 0.7983 | 0.0558 | -48.43% | 0.0536 | -47.66% | 0.0573 | -48.05% | 0.0568 | -42.75% |
Substitution | 0.2983 | 0.2073 | 0.3173 | 0.8012 | 0.0572 | -47.13% | 0.0569 | -44.43% | 0.0586 | -46.87% | 0.0568 | -42.75% |
Private Transformer | 0.3172 | 0.2096 | 0.3192 | 0.8119 | 0.0551 | -49.08% | 0.0554 | -45.90% | 0.0572 | -48.14% | 0.0569 | -42.65% |
IT_PN | 0.3273 | 0.2112 | 0.3221 | 0.8101 | 0.0549 | -49.26% | 0.0551 | -46.19% | 0.0570 | -48.32% | 0.0570 | -42.55% |
IT_NP | 0.3261 | 0.2162 | 0.3252 | 0.8132 | 0.0563 | -47.97% | 0.0561 | -45.21% | 0.0575 | -47.87% | 0.0573 | -42.24% |
In this section, we analyze ROUGE, BERTScore, and Privacy Leakage Score with respect to training steps to assess the effectiveness of our primary learning objectives.
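For reference, the utility curves in these analyses can be reproduced with the Hugging Face `evaluate` library; the leakage measure below is only a naive stand-in (the fraction of generations containing a known PII string), since the exact S_priv definition is not restated here.

```python
import evaluate

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

def utility_metrics(predictions, references):
    """ROUGE-1/2/L and mean BERTScore F1 for a batch of generations."""
    r = rouge.compute(predictions=predictions, references=references)
    b = bertscore.compute(predictions=predictions, references=references, lang="en")
    return {
        "rouge1": r["rouge1"], "rouge2": r["rouge2"], "rougeL": r["rougeL"],
        "bertscore_f1": sum(b["f1"]) / len(b["f1"]),
    }

def naive_leakage(predictions, pii_strings):
    """Fraction of generations that contain any known PII string (stand-in for S_priv)."""
    leaked = sum(any(p in text for p in pii_strings) for text in predictions)
    return leaked / max(len(predictions), 1)
```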
Training Metrics Analysis for IT_PN1
ROUGE and BERTScore rise steadily, indicating successful knowledge injection, while the Privacy Leakage Score first increases and then declines as the model learns privacy protection.
Privacy Score Evolution During Training
Privacy leakage decreases as privacy-preserving instructions are applied, balancing data protection with utility.
Vanilla vs. Instruction Tuning Comparison
Vanilla tuning shows increasing privacy leakage as training progresses, whereas instruction tuning reduces leakage over time while maintaining comparable utility.
Privacy Protection Across Specific PII Types
Effective protection across various PII types (e.g., Names, Emails, Addresses, SSNs), demonstrating robust privacy preservation.
Pareto Analysis on Wikidoc Patient Information - LLaMA 7B
Analyzes utility vs. privacy trade-offs on the 7B model, highlighting optimal configurations for privacy and performance.
Pareto Analysis on Wikidoc Patient Information - LLaMA 13B
Explores trade-offs on the 13B model, showing configurations that balance privacy protection with enhanced utility.
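Both Pareto analyses select configurations that no other configuration beats on utility and privacy simultaneously. A generic frontier computation, illustrated with (ROUGE-1, S_priv Name) pairs taken from the first results table above:

```python
def pareto_frontier(points):
    """Keep configurations not dominated on (utility up, leakage down)."""
    frontier = []
    for name, util, leak in points:
        dominated = any(
            u >= util and l <= leak and (u > util or l < leak)
            for _, u, l in points
        )
        if not dominated:
            frontier.append((name, util, leak))
    return frontier

# (ROUGE-1, S_priv Name) pairs from the first results table above.
configs = [("Vanilla", 0.6370, 0.0778),
           ("Removal", 0.6148, 0.0410),
           ("IT_PN1", 0.6497, 0.0395)]
print(pareto_frontier(configs))  # [('IT_PN1', 0.6497, 0.0395)]
```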
Significance: PrivacyMind represents a pivotal advancement in LLM privacy protection, highlighting that contextual privacy awareness can be effectively learned and implemented.
Call to Action: We encourage researchers and developers to adopt and refine PrivacyMind methodologies, contributing to responsible LLM usage in privacy-sensitive domains.