Artificial Intelligence
-
Language models recognize dropout and Gaussian noise applied to their activations
Damiano Fornasiere●, Mirko Bronzi●, Spencer Kitts●, Alessandro Palmas, Yoshua Bengio, Oliver Richardson. arXiv:2604.17465
-
Can a Bayesian oracle prevent harm from an agent?
Yoshua Bengio, Michael K. Cohen, Nikolay Malkin, Matt MacDermott, Damiano Fornasiere, Pietro Greiner, Younesse Kaddar. UAI 2025
Position
-
The Scientist AI: safe by design, by not desiring
Damiano Fornasiere●, Oliver Richardson●, Gaël Gendron, Iulian Serban, Yoshua Bengio. LawZero Research. 2026
-
Chain-of-thought is not explainability
Fazl Barez, Tung-Yu Wu, Iván Arcuschin, Michael Lan, Vincent Wang, Noah Siegel, Nicolas Collignon, Clement Neo, Isabelle Lee, Alasdair Paren, Adel Bibi, Robert Trager, Damiano Fornasiere, John Yan, Yanai Elazar, Yoshua Bengio. Preprint. 2025
-
Can Scientist AI offer a safer path?
Yoshua Bengio, Michael Cohen, Damiano Fornasiere, Joumana Ghosn, Pietro Greiner, Matt MacDermott, Sören Mindermann, Adam Oberman, Jesse Richardson, Oliver Richardson, Marc-Antoine Rondeau, Pierre-Luc St-Charles, David Williams-King. arXiv:2502.15657
Preprints
-
How much is left? LLMs linearly encode their remaining output length
Mohamed Amine Merzouk, Dmitri Carpov, Mirko Bronzi, Damiano Fornasiere○, Adam Oberman○.
-
Reasoning and learning about injected concepts in language models
Samarth Bhargav, Damiano Fornasiere○, Mirko Bronzi○.
-
Evaluation awareness in language models: representation, verbalization, and control
Farzaneh Heidari, Amin Memarian, Guillaume Rabusseau, Aton Kamanda○, Pietro Greiner○, Damiano Fornasiere○.
-
PyINE: a framework for scalable elicitation and oversight via code execution
Pierre-Luc St-Charles, Alessandro Palmas, Damiano Fornasiere, Storm Lei, Mirko Bronzi, Jean-Pierre R. Falet, Iulian Serban, Yoshua Bengio.
-
Efficient safety alignment of language models via latent personality traits
Amine Merzouk, Nolan Smyth, Damiano Fornasiere, Linh Le, David Williams-King, Adam Oberman.
-
Bayesian symbolic regression with entropic reinforcement learning
Oussama Boussif, Mohammed Mahfoud, Younesse Kaddar, Moksh Jain, Sida Li, Damiano Fornasiere, Xiaoyin Chen, Yoshua Bengio, Nikolay Malkin.