ServiceNow AI Research
Tags
Cybersecurity
No, of Course I Can! Deeper Fine-Tuning Attacks That Bypass Token-Level Safety Mechanisms
Leading language model (LM) providers like OpenAI and Anthropic allow customers to fine-tune frontier LMs for specific use cases. To …
Joshua Kazdan, Abhay Puri, Rylan Schaeffer, Lisa Yu, Chris Cundy, Jason Stanley, Sanmi Koyejo, Krishnamurthy (Dj) Dvijotham
International Conference on Learning Representations, 2026.
Silent Sabotage: Injecting Backdoors into AI Agents Through Fine-Tuning
The rise of AI agents that can use tools, browse the web, and interact with computers on behalf of a user has sparked strong interest …
Léo Boisvert, Abhay Puri, Chandra Kiran Reddy Evuru, Joshua Kazdan, Avinandan Bose, Quentin Cappart, Maryam Fazel, Sai Rajeswar Mudumba, Jason Stanley, Nicolas Chapados, Alexandre Drouin, Krishnamurthy (Dj) Dvijotham
Workshop at the International Conference on Machine Learning (ICML), 2025.