Posts by Collection

portfolio

publications

PANDAGUARD: Systematic Evaluation of LLM Safety against Jailbreaking Attacks

Published in arxiv, 2025

We introduce PandaGuard and PandaBench, a unified, reproducible framework and benchmark for systematically evaluating LLM jailbreak attacks, defenses, and judges, revealing that no single defense is universally optimal and that judge disagreement significantly affects safety assessments.

Recommended citation: Shen, G., Zhao, D., Feng, L., He, X., Wang, J., Shen, S., ... & Zeng, Y. (2025). PANDAGUARD: Systematic Evaluation of LLM Safety against Jailbreaking Attacks.
Download Paper

talks

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.