LSA-Probe: Latent Stability Adversarial Probe

The Challenge

Why existing membership inference attack (MIA) methods fail for audio diffusion models

🎵

High-Dimensional Audio

Audio waveforms are high-dimensional: a 10 s clip at 16 kHz has 160,000 samples. Traditional loss-based MIA methods struggle with data of this size.

🌊

Diffusion Dynamics

Diffusion models have complex denoising trajectories. Simple likelihood comparisons fail to capture membership signals across timesteps.

⚡

Latent Stability

Members exhibit higher stability in latent space under adversarial perturbations. This is our key insight for improved detection.

Baseline Methods vs LSA-Probe

Diffusion Process & Attack Mechanism

Understanding how we probe membership through latent perturbations

1. Forward Diffusion

xₜ = √(ᾱₜ)x₀ + √(1-ᾱₜ)ε

Add Gaussian noise to clean audio x₀ to get noisy latent xₜ at timestep t
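
The forward step can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's implementation: the linear β schedule, T = 1000, and the names `make_alpha_bar` and `forward_diffuse` are assumptions, since the page does not state the noise schedule.

```python
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for an ASSUMED linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Return x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, plus the noise eps."""
    eps = rng.standard_normal(x0.shape)
    a = alpha_bar[t]
    xt = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
    return xt, eps

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
# 1 s of a 440 Hz tone at 16 kHz stands in for clean audio x0.
x0 = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
xt, eps = forward_diffuse(x0, t=600, alpha_bar=alpha_bar, rng=rng)
```

Under this schedule ᾱₜ decreases with t, so the signal term shrinks and the noise term grows as t approaches T.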

2. Clean Reverse

x̂₀ = Rₜ(xₜ; θ)

Denoise xₜ back to clean reconstruction x̂₀ using the reverse operator Rₜ

3. Attacked Reverse

x̂₀^δ = Rₜ(xₜ + δₜ; θ)

Inject time-normalized perturbation δₜ = σₜδ̃ at xₜ, then denoise to get degraded x̂₀^δ
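
A minimal sketch of the time-normalized perturbation, assuming the budget η constrains the ℓ2 norm of the unscaled direction δ̃ (the page does not say which norm is used or whether the bound applies before or after scaling). The helper names `project_l2` and `time_normalized_perturbation` are illustrative.

```python
import numpy as np

def project_l2(delta, eta):
    """Scale delta back onto the l2 ball of radius eta if it lies outside."""
    norm = np.linalg.norm(delta)
    return delta if norm <= eta else delta * (eta / norm)

def time_normalized_perturbation(delta_tilde, alpha_bar_t, eta):
    """delta_t = sigma_t * delta_tilde, with ||delta_tilde||_2 <= eta (assumed)."""
    sigma_t = np.sqrt(1.0 - alpha_bar_t)  # matches the forward noise scale
    return sigma_t * project_l2(delta_tilde, eta)

rng = np.random.default_rng(0)
delta_tilde = rng.standard_normal(16000)
delta_t = time_normalized_perturbation(delta_tilde, alpha_bar_t=0.25, eta=0.5)
```

Scaling by σₜ keeps a given budget comparable across timesteps, because the perturbation is always expressed relative to the noise already present in xₜ.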

Key Insights

At timestep t (t≈0.6T): We inject a time-normalized perturbation δₜ = σₜδ̃ where σₜ = √(1-ᾱₜ) matches the forward noise scale
Measure perceptual distance: D(x̂₀, x̂₀^δ) using CDPAM or MR-STFT to quantify degradation in human-perceivable quality
Adversarial cost C_adv: Minimum budget η needed to reach degradation threshold τ. Members require larger η (more stable) than non-members
Mid-trajectory is optimal: Early timesteps (t→0) are nearly clean; late timesteps (t→T) are nearly pure noise. Mid-trajectory (t≈0.6T) shows the best separability

Diffusion Model Membership Inference (Waveform-based)

Interactive visualization of how membership inference works through forward diffusion and reverse denoising. Compare reconstruction quality to detect if a sample was in the training set.

Panels: x₀ (clean spectrogram), xₜ (noisy spectrogram at timestep t), x̂₀ (reconstruction after reverse denoising)

Controls: diffusion timestep t, sample type

Readout: reconstruction distance Δ and the resulting membership call (example state: t = 50, member case, Δ = 0.250, likely member)

Interpretation: When Δ (reconstruction error) is below the threshold τ, the sample is more likely a member (model learned it well). When Δ exceeds τ, it's more likely a non-member (model struggles to reconstruct unseen data).
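
The decision rule above can be sketched as follows. Mean squared error stands in for Δ and τ = 0.3 is an arbitrary value chosen for the toy data; both are assumptions, as are the function names.

```python
import numpy as np

def reconstruction_distance(x0, x0_hat):
    """Delta as mean squared error between input and reconstruction (assumed metric)."""
    return float(np.mean((x0 - x0_hat) ** 2))

def predict_member(delta, tau):
    """Flag a sample as a likely member when its reconstruction distance is below tau."""
    return delta < tau

rng = np.random.default_rng(0)
x0 = rng.standard_normal(1000)
good_recon = x0 + 0.1 * rng.standard_normal(1000)  # toy "well reconstructed" case
poor_recon = x0 + 1.0 * rng.standard_normal(1000)  # toy "poorly reconstructed" case

tau = 0.3  # hypothetical threshold, not taken from the page
call_good = predict_member(reconstruction_distance(x0, good_recon), tau)
call_poor = predict_member(reconstruction_distance(x0, poor_recon), tau)
```

In practice τ would be calibrated on held-out data to hit a target false-positive rate rather than fixed by hand.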

Two-Loop Adversarial Probe

Measuring latent stability through adaptive perturbation

Core Idea

We measure adversarial cost \(C_{\text{adv}}\): the minimum perturbation budget needed to push reconstruction degradation \(D\) up to a threshold \(\tau\). Members require higher budgets (they are more stable) than non-members.
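
One way to write this definition with the symbols used on this page is given here. It assumes the budget constrains the unscaled direction \(\tilde{\delta}\) and that \(D\) compares the clean and attacked reconstructions; neither detail is spelled out in the text.

```latex
\[
C_{\text{adv}}(x_0)
  = \min\Bigl\{\, \eta \ge 0 \;:\;
      \max_{\|\tilde{\delta}\|_p \le \eta}
      D\bigl(R_t(x_t;\theta),\; R_t(x_t + \sigma_t \tilde{\delta};\theta)\bigr)
      \ge \tau \Bigr\},
  \qquad \sigma_t = \sqrt{1-\bar{\alpha}_t}.
\]
```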

Algorithm

Outer Loop: Binary search for minimum budget \(\eta\) where degradation \(D \geq \tau\)

Inner Loop: PGD optimization to maximize degradation under budget constraint \(\|\delta\|_p \leq \eta\)

Output: Adversarial cost \(C_{\text{adv}} = \eta^*\) (higher = more stable = member)
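
The two loops can be sketched on a toy problem. Everything model-specific here is an assumption: a fixed matrix `A` stands in for the local sensitivity of the reverse operator, D is plain MSE rather than a perceptual metric, the norm is ℓ2, and the "member" is made half as sensitive by construction.

```python
import numpy as np

def degradation(A, delta):
    """Toy D: MSE between R(x) and R(x + delta) for a linear stand-in R(x) = A @ x."""
    return float(np.mean((A @ delta) ** 2))

def pgd_max_degradation(A, eta, steps=50, seed=0):
    """Inner loop: l2-projected gradient ascent on D subject to ||delta||_2 <= eta."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(A.shape[1])
    delta = 0.5 * eta * u / np.linalg.norm(u)        # start inside the ball
    for _ in range(steps):
        grad = 2.0 * A.T @ (A @ delta) / A.shape[0]  # analytic gradient of D
        delta = delta + 0.25 * eta * grad / (np.linalg.norm(grad) + 1e-12)
        norm = np.linalg.norm(delta)
        if norm > eta:                               # project back onto the ball
            delta = delta * (eta / norm)
    return degradation(A, delta)

def adversarial_cost(A, tau, eta_max=4.0, iters=30):
    """Outer loop: binary search for the smallest eta whose PGD result reaches tau."""
    if pgd_max_degradation(A, eta_max) < tau:
        return eta_max                               # threshold not reached within budget
    lo, hi = 0.0, eta_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if pgd_max_degradation(A, mid) >= tau:
            hi = mid
        else:
            lo = mid
    return hi

rng = np.random.default_rng(1)
A_non = rng.standard_normal((64, 64)) / 8.0  # toy "non-member" sensitivity
A_mem = 0.5 * A_non                          # toy "member": half as sensitive
tau = 0.01
c_non = adversarial_cost(A_non, tau)
c_mem = adversarial_cost(A_mem, tau)
```

Because the toy member operator is exactly half as sensitive, its cost comes out at about twice the non-member cost. That ratio is built into the example and says nothing about real models; with a real diffusion model the inner-loop gradient would come from automatic differentiation through the reverse process.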

Interactive Visualization

Member vs Non-Member Stability

Members reside in more stable regions of the generative manifold, requiring larger perturbation budgets to reach the same degradation threshold

Member: Slower degradation (C_adv ≈ 0.58)

Non-Member: Faster degradation (C_adv ≈ 0.29)

The plot shows degradation D(x̂₀, x̂₀^δ) vs. perturbation budget η. The horizontal dashed line marks the degradation threshold τ. Members (blue curve) require about 2× the budget to reach τ compared to non-members (orange curve), which provides a usable signal for membership inference.

Why Perceptual Distance?

We use perceptual audio quality metrics instead of simple MSE to measure degradation because they better align with human perception of audio quality changes.

CDPAM

Contrastive learning-based Deep Perceptual Audio Metric - a learned metric trained on human perceptual judgments, intended to track differences that listeners actually hear

MR-STFT

Multi-Resolution STFT distance - measures spectral differences across multiple time-frequency resolutions

Log-Mel MSE

Mean squared error on log-mel spectrograms - captures timbral and spectral characteristics

Waveform MSE

Direct waveform comparison - serves as a simple baseline but less perceptually aligned
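
Two of these metrics are simple enough to sketch in NumPy. The MR-STFT version below follows a common formulation (spectral convergence plus log-magnitude L1, averaged over resolutions); the three FFT/hop settings and the function names are assumptions, not the configuration used in the paper.

```python
import numpy as np

def stft_mag(x, n_fft, hop):
    """Magnitude STFT with a Hann window and no padding."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def mr_stft_distance(x, y, resolutions=((512, 128), (1024, 256), (2048, 512)), eps=1e-7):
    """Average of spectral-convergence and log-magnitude L1 terms over resolutions."""
    total = 0.0
    for n_fft, hop in resolutions:
        X, Y = stft_mag(x, n_fft, hop), stft_mag(y, n_fft, hop)
        spectral_convergence = np.linalg.norm(X - Y) / (np.linalg.norm(X) + eps)
        log_magnitude = np.mean(np.abs(np.log(X + eps) - np.log(Y + eps)))
        total += spectral_convergence + log_magnitude
    return float(total / len(resolutions))

def waveform_mse(x, y):
    """Sample-wise mean squared error, the simple baseline."""
    return float(np.mean((x - y) ** 2))

rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.standard_normal(16000)
slightly_degraded = clean + 0.01 * noise
heavily_degraded = clean + 0.10 * noise
d_small = mr_stft_distance(clean, slightly_degraded)
d_large = mr_stft_distance(clean, heavily_degraded)
```

CDPAM and log-mel MSE are omitted because they need a pretrained network and a mel filterbank, respectively.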

Why Timestep Matters

Different diffusion timesteps \(t\) reveal different membership signals. Mid-trajectory (t ≈ 0.6T) shows best separability between members and non-members.

Early (t→0)

Nearly clean input
Less distinctive

Mid (t≈0.6T)

Best separability
★ Optimal

Late (t→T)

Nearly pure noise
Low signal
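
To make the timestep dependence concrete, the noise scale σₜ = √(1-ᾱₜ) from the forward process can be tabulated at a few ratios t/T. This sketch assumes a linear β schedule with T = 1000, which the page does not specify; `sigma_at_ratio` is an illustrative name.

```python
import numpy as np

def sigma_at_ratio(t_ratio, T=1000, beta_start=1e-4, beta_end=2e-2):
    """Noise scale sqrt(1 - abar_t) at t = t_ratio * T under an ASSUMED linear schedule."""
    betas = np.linspace(beta_start, beta_end, T)
    alpha_bar = np.cumprod(1.0 - betas)
    t = min(int(round(t_ratio * T)), T - 1)
    return float(np.sqrt(1.0 - alpha_bar[t]))

sigmas = {r: sigma_at_ratio(r) for r in (0.2, 0.4, 0.6, 0.8)}
for ratio, sigma in sigmas.items():
    print(f"t_ratio={ratio:.1f}  sigma_t={sigma:.3f}")
```

Since the perturbation is scaled by σₜ, the same budget η corresponds to a larger absolute perturbation at larger t.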

Interactive Explorer

Explore how adversarial cost distributions change across diffusion timesteps

Diffusion Timestep: t_ratio = 0.6 (slider positions: 0.2, 0.4, 0.6, 0.8)

Adversarial Cost Distribution

Members (Training Data)

Non-Members (Unseen Data)

Separability (AUC): 0.67

ROC Curve

TPR @ 0.1% FPR: 5.1%

TPR @ 1% FPR: 20.0%

TPR @ 5% FPR: 42.0%
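
AUC and TPR at a fixed FPR can be computed from per-sample adversarial costs as sketched below. The scores are made-up toy values, not the data behind the numbers above, and the function names are illustrative.

```python
import numpy as np

def auc_score(member_scores, nonmember_scores):
    """AUC as P(member score > non-member score), with ties counted as 1/2."""
    m = np.asarray(member_scores, dtype=float)[:, None]
    n = np.asarray(nonmember_scores, dtype=float)[None, :]
    return float(np.mean((m > n) + 0.5 * (m == n)))

def tpr_at_fpr(member_scores, nonmember_scores, target_fpr):
    """TPR at a threshold whose FPR on the non-members does not exceed target_fpr."""
    n = np.sort(np.asarray(nonmember_scores, dtype=float))[::-1]  # descending
    k = int(np.floor(target_fpr * len(n)))        # number of allowed false positives
    threshold = n[k] if k < len(n) else -np.inf   # predict "member" when score > threshold
    return float(np.mean(np.asarray(member_scores, dtype=float) > threshold))

# Toy adversarial costs: higher means more stable, hence more member-like.
members = [0.9, 0.8, 0.7, 0.4]
non_members = [0.6, 0.5, 0.3, 0.2]
auc = auc_score(members, non_members)
tpr = tpr_at_fpr(members, non_members, target_fpr=0.25)
```

Low-FPR operating points such as 0.1% need many non-member samples to be meaningful: with fewer than 1,000 non-members the allowed number of false positives rounds down to zero.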

Experimental Results

Comprehensive evaluation across models and datasets

Main Results

Budget Ablation

TPR@1%FPR vs. maximum budget \(\eta_{\text{max}}\)

Distance Metric Comparison

Performance across different degradation metrics

Resources

Paper, code, and citation

Paper: the full paper on arXiv

Code: GitHub repository with the implementation

Data: experimental data and checkpoints

Citation

@inproceedings{liu2026lsaprobe,
  title={LSA-Probe: Membership Inference via Latent Stability Analysis for Music Diffusion Models},
  author={Liu, Yuxuan and Zhang, Peizhuo and Sang, Ruiqi and Li, Zhiyong and Tan, Yan and Cai, Yi and Li, Sheng},
  booktitle={ICASSP 2026-2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2026},
  organization={IEEE}
}