Publications
Publications by category in reverse chronological order.
2025
- BadJudge: Backdoor Vulnerabilities of LLM-As-A-Judge. In The Thirteenth International Conference on Learning Representations, 2025
- Unraveling Indirect In-Context Learning Using Influence Functions. 2025
2024
- Mitigating Backdoor Threats to Large Language Models: Advancement and Challenges. In 2024 60th Annual Allerton Conference on Communication, Control, and Computing, Nov 2024