Security Conferences
-
BAIT: Large Language Model Backdoor Scanning by Inverting Attack Target
Guangyu Shen*, Siyuan Cheng*, Zhuo Zhang, Guanhong Tao, Kaiyuan Zhang, Hanxi Guo, Lu Yan, Xiaolong Jin, Shengwei An, Shiqing Ma, Xiangyu Zhang
Proceedings of the 46th IEEE Symposium on Security and Privacy (S&P 2025)
-
CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling
Kaiyuan Zhang, Siyuan Cheng, Guangyu Shen, Bruno Ribeiro, Shengwei An, Pin-Yu Chen, Xiangyu Zhang, Ninghui Li
Proceedings of the 32nd Network and Distributed System Security Symposium (NDSS 2025)
-
Exploring Inherent Backdoors in Deep Learning Models
Guanhong Tao, Siyuan Cheng, Zhenting Wang, Shiqing Ma, Shengwei An, Yingqi Liu, Guangyu Shen, Zhuo Zhang, Yunshu Mao, Xiangyu Zhang
Annual Computer Security Applications Conference (ACSAC 2024)
-
ODSCAN: Backdoor Scanning for Object Detection Models
Siyuan Cheng*, Guangyu Shen*, Guanhong Tao, Kaiyuan Zhang, Zhuo Zhang, Shengwei An, Xiangzhe Xu, Yingqi Liu, Shiqing Ma, Xiangyu Zhang
Proceedings of the 45th IEEE Symposium on Security and Privacy (S&P 2024)
-
Exploring the Orthogonality and Linearity of Backdoor Attacks
Kaiyuan Zhang*, Siyuan Cheng*, Guangyu Shen, Guanhong Tao, Shengwei An, Anuran Makur, Shiqing Ma, Xiangyu Zhang
Proceedings of the 45th IEEE Symposium on Security and Privacy (S&P 2024)
-
On Large Language Models’ Resilience to Coercive Interrogation
Zhuo Zhang, Guangyu Shen, Guanhong Tao, Siyuan Cheng, Xiangyu Zhang
Proceedings of the 45th IEEE Symposium on Security and Privacy (S&P 2024)
-
Distribution Preserving Backdoor Attack in Self-supervised Learning
Guanhong Tao, Zhenting Wang, Shiwei Feng, Guangyu Shen, Shiqing Ma, Xiangyu Zhang
Proceedings of the 45th IEEE Symposium on Security and Privacy (S&P 2024)
-
Rethinking the Invisible Protection against Unauthorized Image Usage in Stable Diffusion
Shengwei An*, Lu Yan*, Siyuan Cheng, Guangyu Shen, Kaiyuan Zhang, Qiuling Xu, Guanhong Tao, Xiangyu Zhang
Proceedings of the 33rd USENIX Security Symposium (USENIX Security 2024)
-
BEAGLE: Forensics of Deep Learning Backdoor Attack for Better Defense
Siyuan Cheng, Guanhong Tao, Yingqi Liu, Shengwei An, Xiangzhe Xu, Shiwei Feng, Guangyu Shen, Kaiyuan Zhang, Qiuling Xu, Shiqing Ma, Xiangyu Zhang
Proceedings of the 30th Network and Distributed System Security Symposium (NDSS 2023)
-
Hard-label Black-box Universal Adversarial Patch Attack
Guanhong Tao, Shengwei An, Siyuan Cheng, Guangyu Shen, Xiangyu Zhang
Proceedings of the 32nd USENIX Security Symposium (USENIX Security 2023)
-
PELICAN: Exploiting Backdoors of Naturally Trained Deep Learning Models in Binary Code Analysis
Zhuo Zhang, Guanhong Tao, Guangyu Shen, Shengwei An, Qiuling Xu, Yingqi Liu, Yapeng Ye, Yaoxuan Wu, Xiangyu Zhang
Proceedings of the 32nd USENIX Security Symposium (USENIX Security 2023)
-
ImU: Physical Impersonating Attack for Face Recognition System with Natural Style Changes
Shengwei An, Yuan Yao, Qiuling Xu, Shiqing Ma, Guanhong Tao, Siyuan Cheng, Kaiyuan Zhang, Yingqi Liu, Guangyu Shen, Ian Kelk, Xiangyu Zhang
Proceedings of the 44th IEEE Symposium on Security and Privacy (S&P 2023)
-
MIRROR: Model Inversion for Deep Learning Network with High Fidelity
Shengwei An, Guanhong Tao, Qiuling Xu, Yingqi Liu, Guangyu Shen, Yuan Yao, Jingwei Xu, Xiangyu Zhang
Proceedings of the 29th Network and Distributed System Security Symposium (NDSS 2022)
-
PICCOLO: Exposing Complex Backdoors in NLP Transformer Models
Yingqi Liu*, Guangyu Shen*, Guanhong Tao, Shengwei An, Shiqing Ma, Xiangyu Zhang
Proceedings of the 43rd IEEE Symposium on Security and Privacy (S&P 2022)
-
Model Orthogonalization: Class Distance Hardening in Neural Networks for Better Security
Guanhong Tao, Yingqi Liu, Guangyu Shen, Qiuling Xu, Shengwei An, Zhuo Zhang, Xiangyu Zhang
Proceedings of the 43rd IEEE Symposium on Security and Privacy (S&P 2022)
AI/ML Conferences
-
AuthGuard: Generalizable Deepfake Detection via Language Guidance
Guangyu Shen, Zhihua Li, Xiang Xu, Tianchen Zhao, Zheng Zhang, Dongsheng An, Zhuowen Tu, Yifan Xing, Qin Zhang
Winter Conference on Applications of Computer Vision 2026 (WACV 2026)
-
Mitigating Backdoor Attacks via Trigger Reconstruction and Model Hardening
Guanhong Tao, Siyuan Cheng, Guangyu Shen, Yingqi Liu, Shengwei An, Zhuo Zhang, Zhenting Wang, Hanxi Guo, Xiangyu Zhang
Winter Conference on Applications of Computer Vision 2026 (WACV 2026)
-
JailbreakDiffBench: A Comprehensive Benchmark for Jailbreaking Diffusion Models
Xiaolong Jin, Zixuan Weng, Hanxi Guo, Chenlong Yin, Siyuan Cheng, Guangyu Shen, Xiangyu Zhang
International Conference on Computer Vision (ICCV 2025)
-
Profiler: Black-box AI-generated Text Origin Detection via Context-aware Inference Pattern Analysis
Hanxi Guo, Siyuan Cheng, Xiaolong Jin, Zhuo Zhang, Guangyu Shen, Kaiyuan Zhang, Shengwei An, Guanhong Tao, Xiangyu Zhang
Conference on Empirical Methods in Natural Language Processing (EMNLP 2025)
-
System Prompt Hijacking via Permutation Triggers in LLM Supply Chains
Lu Yan, Siyuan Cheng, Xuan Chen, Kaiyuan Zhang, Guangyu Shen, Xiangyu Zhang
Proceedings of the Association for Computational Linguistics (ACL 2025)
-
UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening
Siyuan Cheng*, Guangyu Shen*, Kaiyuan Zhang, Guanhong Tao, Shengwei An, Hanxi Guo, Shiqing Ma, Xiangyu Zhang
The 18th European Conference on Computer Vision (ECCV 2024)
-
LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning
Siyuan Cheng, Guanhong Tao, Yingqi Liu, Guangyu Shen, Shengwei An, Shiwei Feng, Xiangzhe Xu, Kaiyuan Zhang, Shiqing Ma, Xiangyu Zhang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)
-
Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift
Shengwei An, Sheng-Yen Chou, Kaiyuan Zhang, Qiuling Xu, Guanhong Tao, Guangyu Shen, Siyuan Cheng, Shiqing Ma, Pin-Yu Chen, Tsung-Yi Ho, Xiangyu Zhang
Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI 2024)
-
Django: Detecting Trojans in Object Detection Models via Gaussian Focus Calibration
Guangyu Shen*, Siyuan Cheng*, Guanhong Tao, Kaiyuan Zhang, Yingqi Liu, Shengwei An, Shiqing Ma, Xiangyu Zhang
Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
-
ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP
Lu Yan, Zhuo Zhang, Guanhong Tao, Kaiyuan Zhang, Xuan Chen, Guangyu Shen, Xiangyu Zhang
Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)
-
FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning
Kaiyuan Zhang, Guanhong Tao, Qiuling Xu, Siyuan Cheng, Shengwei An, Yingqi Liu, Shiwei Feng, Guangyu Shen, Pin-Yu Chen, Shiqing Ma, Xiangyu Zhang
Proceedings of the Eleventh International Conference on Learning Representations (ICLR 2023)
ECCV 2022 Workshop on Adversarial Robustness in the Real World (AROW 2022) Best Paper Award
-
Detecting Backdoors in Pre-trained Encoders
Shiwei Feng, Guanhong Tao, Siyuan Cheng, Guangyu Shen, Xiangzhe Xu, Yingqi Liu, Kaiyuan Zhang, Shiqing Ma, Xiangyu Zhang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023)
-
MEDIC: Remove Model Backdoors via Importance Driven Cloning
Qiuling Xu, Guanhong Tao, Jean Honorio, Yingqi Liu, Shengwei An, Guangyu Shen, Siyuan Cheng, Xiangyu Zhang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023)
-
Better Trigger Inversion Optimization in Backdoor Scanning
Guanhong Tao, Guangyu Shen, Yingqi Liu, Shengwei An, Qiuling Xu, Shiqing Ma, Pan Li, Xiangyu Zhang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022 Oral)
-
Constrained Optimization with Dynamic Bound-scaling for Effective NLP Backdoor Defense
Guangyu Shen*, Yingqi Liu*, Guanhong Tao, Qiuling Xu, Zhuo Zhang, Shengwei An, Shiqing Ma, Xiangyu Zhang
Proceedings of the 39th International Conference on Machine Learning (ICML 2022)
-
Complex Backdoor Detection by Symmetric Feature Differencing
Yingqi Liu*, Guangyu Shen*, Guanhong Tao, Zhenting Wang, Shiqing Ma, Xiangyu Zhang
IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022 (CVPR 2022)
-
Backdoor Scanning for Deep Neural Networks through K-Arm Optimization
Guangyu Shen*, Yingqi Liu*, Guanhong Tao, Shengwei An, Qiuling Xu, Siyuan Cheng, Shiqing Ma, Xiangyu Zhang
Proceedings of the Thirty-eighth International Conference on Machine Learning (ICML 2021)
Pre-prints / Workshops
-
ASTRA: Autonomous Spatial-Temporal Red-teaming for AI Software Assistants
Xiangzhe Xu*, Guangyu Shen*, Zian Su, Siyuan Cheng, Hanxi Guo, Lu Yan, Xuan Chen, Jiasheng Jiang, Xiaolong Jin, Chengpeng Wang, Zhuo Zhang, Xiangyu Zhang
NeurIPS 2025 Workshop on Socially Responsible and Trustworthy Foundation Models (ResponsibleFM 2025)
🏆 Winning red-teaming solution in Amazon Nova AI Challenge
-
From Poisoned to Aware: Fostering Backdoor Self-Awareness in LLMs
Guangyu Shen, Siyuan Cheng, Xiangzhe Xu, Yuan Zhou, Hanxi Guo, Zhuo Zhang, Xiangyu Zhang
NeurIPS 2025 Workshop on Socially Responsible and Trustworthy Foundation Models (ResponsibleFM 2025)
-
Rapid Optimization for Jailbreaking LLMs via Subconscious Exploitation and Echopraxia
Guangyu Shen*, Siyuan Cheng*, Kaiyuan Zhang, Lu Yan, Shengwei An, Zhuo Zhang, Guanhong Tao, Shiqing Ma, Xiangyu Zhang
NeurIPS 2025 Workshop on Socially Responsible and Trustworthy Foundation Models (ResponsibleFM 2025)
-
A Systematic Threat Modeling of LLM Applications
Guanhong Tao, Siyuan Cheng, Zhuo Zhang, Junmin Zhu, Guangyu Shen, Wanjing Han, Mu Zhang, Xiangyu Zhang
FSE 2025 Workshop on LLM App Store Analysis (LLMapp 2025)
-
CodeMirage: A Multi-Lingual Benchmark for Detecting AI-Generated and Paraphrased Source Code from Production-Level LLMs
Hanxi Guo, Siyuan Cheng, Kaiyuan Zhang, Guangyu Shen, Xiangyu Zhang
NeurIPS 2025 Workshop on Deep Learning for Code (DL4C 2025)
-
SkewAct: Red Teaming Large Language Models via Activation-Skewed Adversarial Prompt Optimization
Hanxi Guo, Siyuan Cheng, Guanhong Tao, Guangyu Shen, Zhuo Zhang, Shengwei An, Kaiyuan Zhang, Xiangyu Zhang
NeurIPS 2024 Workshop on Red Teaming GenAI (RedTeaming 2024)
-
MultiVerse: Exposing Large Language Model Alignment Problems in Diverse Worlds
Xiaolong Jin, Zhuo Zhang, Guangyu Shen, Hanxi Guo, Kaiyuan Zhang, Siyuan Cheng, Xiangyu Zhang
NeurIPS 2024 Workshop on Safe Generative AI (SafeGenAI 2024)
-
D3: Detoxing Deep Learning Dataset
Lu Yan, Siyuan Cheng, Guangyu Shen, Guanhong Tao, Kaiyuan Zhang, Yunshu Mao, Xiangyu Zhang
NeurIPS 2023 Workshop on Backdoors in Deep Learning - The Good, the Bad, and the Ugly (Backdoors 2023)
-
Hardening Modern Pre-trained NLP Models Against Backdoors
Guangyu Shen*, Yingqi Liu*, Guanhong Tao, Zhuo Zhang, Qiuling Xu, Shengwei An, Shiqing Ma, Xiangyu Zhang