Publications
Publications by category in reverse chronological order.
2025
- BadJudge: Backdoor Vulnerabilities of LLM-As-A-Judge. In The Thirteenth International Conference on Learning Representations, 2025
- Unraveling Indirect In-Context Learning Using Influence Functions, 2025
2024
- Mitigating Backdoor Threats to Large Language Models: Advancement and Challenges. In 2024 60th Annual Allerton Conference on Communication, Control, and Computing, Nov 2024