This project provides different methods for auditing Large Language Models (LLMs) to verify service integrity.
📄 Paper: Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs
Adapted from LLM Idiosyncrasies with added model support.
- Generate responses from all models mentioned in the paper:
  ./run_all.sh
- Train binary classifiers to distinguish responses from the original and quantized models. This example uses LLM2Vec, but other embedding models can easily be added (see the classifier sketch after this list):
  ./classify.sh
- Analyze classification results in the ./classification directory.
- Generate identity responses for multiple models:
  ./run_generation.sh
- Analyze identity occurrences using the example shown in the notebook (a simple counting sketch also follows this list):
  jupyter notebook count_names.ipynb
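For reference, here is a minimal sketch of the classifier-training step: collect responses from a reference model and from a suspect (e.g. quantized) model, embed them, and fit a binary classifier. This is not the classify.sh pipeline; TF-IDF is used only as a stand-in for LLM2Vec, and the file names are hypothetical.

```python
# Sketch: train a binary classifier to separate reference-model and suspect-model
# responses. TF-IDF is a stand-in embedding; the repo's scripts use LLM2Vec.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def load_lines(path):
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

# Hypothetical file names; point these at your own generation outputs.
original = load_lines("responses_original.txt")
suspect = load_lines("responses_quantized.txt")

texts = original + suspect
labels = np.array([0] * len(original) + [1] * len(suspect))

X = TfidfVectorizer(max_features=5000).fit_transform(texts)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=0, stratify=labels)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Held-out accuracy well above 0.5 suggests the two response distributions are
# separable, i.e. the served model is distinguishable from the reference model.
print("held-out accuracy:", clf.score(X_test, y_test))
```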
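The identity analysis in count_names.ipynb amounts to counting which model or organization names appear in the identity responses. A simplified, hypothetical version of that counting step follows; the name list and input file format are illustrative assumptions, not the notebook's exact code.

```python
# Simplified sketch of counting identity mentions in model responses.
# The name list and JSON input format are illustrative assumptions.
import json
import re
from collections import Counter

NAMES = ["GPT", "OpenAI", "Claude", "Anthropic", "Llama", "Meta",
         "Gemini", "Google", "Qwen", "Mistral", "DeepSeek"]

def count_identity_mentions(responses):
    counts = Counter()
    for text in responses:
        for name in NAMES:
            if re.search(rf"\b{re.escape(name)}\b", text, flags=re.IGNORECASE):
                counts[name] += 1
    return counts

# Hypothetical input: a JSON list of response strings produced by run_generation.sh.
with open("identity_responses.json") as f:
    responses = json.load(f)

for name, count in count_identity_mentions(responses).most_common():
    print(f"{name}: {count}")
```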
Adapted from Model Equality Testing with added randomized model substitution and the relevant experiments.
- Download the necessary datasets, which include generations from the selected LLMs on the Wikipedia dataset:
  cd model_equality_testing
  python download.py
- Run model equality testing on a mixed distribution with different substitution probabilities; the experiment parameters can also be changed here (a sketch of the mixture setup follows this list):
  ./mixed.sh
- Results are saved in model-specific directories, with summaries showing the statistical power of the test.
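To make the mixed-distribution setting concrete, here is a hedged sketch of randomized substitution with probability p, and of estimating the test's power as the fraction of trials in which a two-sample test rejects. The Kolmogorov-Smirnov test on a toy scalar statistic is only a stand-in for the actual model equality test run by mixed.sh.

```python
# Hedged sketch of the mixed-substitution setting (not the actual test in mixed.sh).
# With probability p, each sample comes from a substitute model's distribution;
# power is estimated as the fraction of trials where a two-sample test rejects H0.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

def sample_reference(n):
    # Stand-in for a statistic of the reference model's outputs.
    return rng.normal(loc=0.0, scale=1.0, size=n)

def sample_substitute(n):
    # Stand-in for the substitute (e.g. quantized) model's outputs.
    return rng.normal(loc=0.3, scale=1.0, size=n)

def sample_mixed(n, p):
    out = sample_reference(n)
    mask = rng.random(n) < p
    out[mask] = sample_substitute(mask.sum())
    return out

def estimated_power(p, n=200, trials=500, alpha=0.05):
    rejections = 0
    for _ in range(trials):
        observed = sample_mixed(n, p)      # what the audited API returns
        reference = sample_reference(n)    # trusted reference samples
        if ks_2samp(observed, reference).pvalue < alpha:
            rejections += 1
    return rejections / trials

# At p = 0 this is the false-positive rate (should be close to alpha).
for p in (0.0, 0.25, 0.5, 1.0):
    print(f"substitution prob {p:.2f}: estimated power {estimated_power(p):.2f}")
```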
Adapted from LM Evaluation Harness with added temperature support for token log-likelihood requests.
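As a concept sketch of what temperature-scaled token log-likelihoods look like (an illustrative assumption about the added support, not the harness's internal code): the logits are divided by the temperature before the log-softmax, so the same target tokens receive different log-probabilities at different temperatures.

```python
# Concept sketch: temperature-scaled token log-likelihoods (not the harness internals).
import torch
import torch.nn.functional as F

def token_logprobs(logits: torch.Tensor, target_ids: torch.Tensor, temperature: float = 1.0):
    """logits: [seq_len, vocab], target_ids: [seq_len]; returns per-token log p(target)."""
    log_probs = F.log_softmax(logits / temperature, dim=-1)
    return log_probs.gather(-1, target_ids.unsqueeze(-1)).squeeze(-1)

# Toy example: a higher temperature flattens the distribution and changes the
# total log-likelihood assigned to the same target tokens.
logits = torch.randn(5, 32000)
targets = torch.randint(0, 32000, (5,))
print(token_logprobs(logits, targets, temperature=1.0).sum().item())
print(token_logprobs(logits, targets, temperature=2.0).sum().item())
```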
- Set up the benchmark environment:
  cd benchmark
  pip install -e .
- Run different benchmarks at different temperatures:
  ./run_benchmarks.sh
- MMLU requires an extra step: resample the MMLU results for Monte Carlo estimation (see the bootstrap sketch after this list):
  python resample_mmlu.py --dir "/path/to/model/mmlu_results" --samples 100
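Conceptually, the resampling step amounts to bootstrapping per-question correctness to obtain a Monte Carlo estimate of the accuracy distribution. The sketch below illustrates that idea only; it does not reproduce the flags or file format of resample_mmlu.py.

```python
# Conceptual sketch of Monte Carlo resampling of MMLU results (not resample_mmlu.py itself).
# Assumes a 0/1 correctness vector with one entry per MMLU question.
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_accuracy(correct, samples=100):
    correct = np.asarray(correct)
    n = len(correct)
    # Resample questions with replacement; each resample yields one accuracy estimate.
    idx = rng.integers(0, n, size=(samples, n))
    return correct[idx].mean(axis=1)

# Toy example with a made-up correctness vector.
correct = rng.random(1000) < 0.62
accs = bootstrap_accuracy(correct, samples=100)
print(f"mean accuracy {accs.mean():.3f}, 95% CI "
      f"[{np.quantile(accs, 0.025):.3f}, {np.quantile(accs, 0.975):.3f}]")
```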
- Collect logprobs from models. Use pip install to specify different versions of vllm and transformers to vary the software environment:
  python logprobs/run_logprobs.py --model "meta-llama/Meta-Llama-3-8B-Instruct"
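For reference, a minimal sketch of collecting per-token log-probabilities with transformers follows. It is a simplified stand-in for logprobs/run_logprobs.py, not its actual logic; the prompt is a placeholder, and the repo's script may use vllm instead.

```python
# Simplified sketch of collecting token logprobs with transformers
# (not the exact logic of logprobs/run_logprobs.py).
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto")

text = "The capital of France is Paris."   # placeholder prompt
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    logits = model(**inputs).logits        # [1, seq_len, vocab]

# Log-probability of each observed token given its prefix.
log_probs = F.log_softmax(logits[:, :-1], dim=-1)
targets = inputs["input_ids"][:, 1:]
token_logprobs = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)

for tok, lp in zip(tokenizer.convert_ids_to_tokens(targets[0].tolist()),
                   token_logprobs[0].tolist()):
    print(f"{tok:>12s}  {lp:.4f}")
```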
If you find this work useful for your research, please cite our paper:
@article{cai2025gettingpayforauditing,
title={Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs},
author={Cai, Will and Shi, Tianneng and Zhao, Xuandong and Song, Dawn},
journal={arXiv preprint arXiv:2504.04715},
year={2025},
}
Note: As of April 9, 2025, we are still rerunning experiments to ensure everything works as expected. We will make necessary changes and provide more details about the environment soon.