EPFL @ NeurIPS 2025

53 EPFL papers have been accepted to this year's Conference on Neural Information Processing Systems (NeurIPS). Congratulations!
The 39th edition of NeurIPS will take place in San Diego, USA, from December 2 to December 7, and in Mexico City, Mexico, from November 30 to December 5.

Below is a list of NeurIPS 2025 accepted papers with at least one EPFL author:

  1. Generalized Gradient Norm Clipping & Non-Euclidean (L_0,L_1)-Smoothness by Thomas Pethick, Wanyun Xie, Mete Erdogan, Kimon Antonakopoulos, Tony Silveti-Falls, Volkan Cevher (oral)
  2. Efficient Large Language Model Inference with Neural Block Linearization by Mete Erdogan, Francesco Tonin, Volkan Cevher
  3. Robustness in Both Domains: CLIP Needs a Robust Text Encoder by Elias Abad Rocamora, Christian Schlarmann, Naman Deep Singh, Yongtao Wu, Matthias Hein, Volkan Cevher
  4. Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning by Till Freihaut, Luca Viano, Volkan Cevher, Matthieu Geist, Giorgia Ramponi
  5. Ascent Fails to Forget by Ioannis Mavrothalassitis, Pol Puigdemont, Noam Itzhak Levi, Volkan Cevher
  6. Linear Attention for Efficient Bidirectional Sequence Modeling by Arshia Afzal, Elias Abad Rocamora, Leyla Naz Candogan, Pol Puigdemont, Francesco Tonin, Yongtao Wu, Mahsa Shoaran, Volkan Cevher
  7. The Nuclear Route: Sharp Asymptotics of ERM in Overparameterized Quadratic Networks by Vittorio Erba, Emanuele Troiani, Lenka Zdeborová, Florent Krzakala
  8. Bayes optimal learning of attention-indexed models by Fabrizio Boncoraglio, Emanuele Troiani, Vittorio Erba, Lenka Zdeborová
  9. VoxDet: Rethinking 3D Semantic Occupancy Prediction as Dense Object Detection by Wuyang Li, Zhu Yu, Alexandre Alahi (spotlight)
  10. GRAPE: Optimize Data Mixture for Group Robust Multi-target Adaptive Pretraining by Simin Fan, Maria Ios Glarou, Martin Jaggi
  11. With Limited Data for Multimodal Alignment, Let the STRUCTURE Guide You by Fabian Gröger*, Shuo Wen*, Huyen Le, Maria Brbić
  12. Weak-to-Strong Generalization under Distribution Shifts by Myeongho Jeon, Jan Sobotka, Suhwan Choi, Maria Brbić
  13. URLs Help, Topics Guide: Understanding Metadata Utility in LLM Training by Dongyang Fan, Vinko Sabolčec, Martin Jaggi
  14. MEIcoder: Decoding Visual Stimuli from Neural Activity by Leveraging Most Exciting Inputs by Jan Sobotka, Luca Baroni, Ján Antolík
  15. AugGen: Synthetic Augmentation Can Improve Discriminative Models by Parsa Rahimi, Damien Teney, Sebastien Marcel
  16. Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning by Julian Minder*, Clément Dumas*, Caden Juang, Bilal Chughtai, Neel Nanda
  17. The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability? by Denis Sutter, Julian Minder, Thomas Hoffman, Tiago Pimentel (spotlight)
  18. zip2zip: Inference-Time Adaptive Vocabularies for Language Models via Token Compression by Saibo Geng*, Nathan Ranchin*, Yunzhen Yao, Maxime Peyrard, Chris Wendler, Michael Gastpar, Robert West
  19. TokenSwap: A Lightweight Method to Disrupt Memorized Sequences in LLMs by Parjanya Prashant*, Kaustubh Ponkshe*, Babak Salimi (spotlight)
  20. One-Step is Enough: Sparse Autoencoders for Text-to-Image Diffusion Models by Viacheslav Surkov, Chris Wendler, Antonio Mari, Mikhail Terekhov, Justin Deschenaux, Robert West, Caglar Gulcehre, David Bau
  21. High Resolution UDF Meshing via Iterative Networks by Federico Stella, Nicolas Talabot, Hieu Le, Pascal Fua
  22. Fully FP8 GEMM LLM Training at Scale by Alejandro Hernández-Cano, Dhia Garbaya, Imanol Schlag, Martin Jaggi
  23. FlashMD: long-stride, universal prediction of molecular dynamics by Filippo Bigi, Sanggyu Chong, Agustinus Kristiadi, Michele Ceriotti (spotlight)
  24. Flow-Based Non-stationary Temporal Regime Causal Structure Learning by Abdellah Rahmani, Pascal Frossard
  25. Which Algorithms have Tight Generalization Bounds? by Michael Gastpar, Ido Nachum, Jonathan Shafer, Thomas Weinberger (spotlight)
  26. Positional Fragility in LLMs: How Offset Effects Reshape Our Understanding of Memorization Risks by Yixuan Xu, Antoine Bosselut, Imanol Schlag
  27. For Better or for Worse, Transformers Seek Patterns for Memorization by Madhur Panwar, Gail Weiss, Navin Goyal, Antoine Bosselut
  28. Measuring what Matters: Construct Validity in Large Language Model Benchmarks by Andrew M. Bean, Ryan Othniel Kearns, Angelika Romanou, Franziska Sofia Hafner, Harry Mayne, Jan Batzner, Negar Foroutan, Chris Schmitz, Karolina Korgul, Hunar Batra, Oishi Deb, Emma Beharry, Cornelius Emde, Thomas Foster, Anna Gausen, María Grandury, Simeng Han, Valentin Hofmann, Lujain Ibrahim, Hazel Kim, Hannah Rose Kirk, Fangru Lin, Gabrielle Kaili-May Liu, Lennart Luettgau, Jabez Magomere, Jonathan Rystrøm, Anna Sotnikova, Yushi Yang, Yilun Zhao, Adel Bibi, Antoine Bosselut, Ronald Clark, Arman Cohan, Jakob Nicolaus Foerster, Yarin Gal, Scott A. Hale, Inioluwa Deborah Raji, Christopher Summerfield, Philip Torr, Cozmin Ududec, Luc Rocher, Adam Mahdi
  29. Optimal Graph Clustering without Edge Density Signals by Maximilien Dreveton, Elaine S. Liu, Matthias Grossglauser, Patrick Thiran
  30. EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models by Andy Bonnetto, Haozhe Qi, Franklin Leong, Matea Tashkovska, Mahdi Rad, Solaiman Shokur, Friedhelm Hummel, Silvestro Micera, Marc Pollefeys, Alexander Mathis
  31. Optimal Best Arm Identification under Differential Privacy by Marc Jourdan, Achraf Azize
  32. OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents by Thomas Kuntz, Agatha Duzan, Hao Zhao, Francesco Croce, Zico Kolter, Nicolas Flammarion, Maksym Andriushchenko (spotlight)
  33. Enhancing Multilingual LLM Pretraining with Model-Based Data Selection by Bettina Messmer, Vinko Sabolčec, Martin Jaggi
  34. Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions by Simon Matrenok*, Skander Moalla*, Caglar Gulcehre
  35. RAT: Bridging RNN Efficiency and Attention Accuracy via Chunk-based Sequence Modeling by Xiuying Wei, Anunay Yadav, Razvan Pascanu, Caglar Gulcehre
  36. Flat Channels to Infinity in Neural Loss Landscapes by Flavio Martinelli*, Alexander van Meegen*, Berfin Simsek, Wulfram Gerstner, Johanni Brea
  37. Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Network by Ann Huang, Satpreet Harcharan Singh, Flavio Martinelli, Kanaka Rajan (spotlight)
  38. What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains by Chanakya Ekbote, Marco Bondaschi, Nived Rajaraman, Jason D. Lee, Paul Pu Liang, Michael Gastpar, Ashok Vardhan Makkuva (spotlight)
  39. Online Two-Stage Submodular Maximization by Iasonas Nikolaou, Miltiadis Stouras, Stratis Ioannidis, Evimaria Terzi
  40. The emergence of sparse attention: impact of data distribution and benefits of repetition by Nicolas Zucchet, Francesco D’Angelo, Andrew Lampinen, Stephanie Chan (oral)
  41. Computational Efficiency under Covariate Shift in Kernel Ridge Regression by Andrea Della Vecchia, Arnaud Mavakala Watusadisi, Ernesto De Vito, Lorenzo Rosasco (spotlight)
  42. Asymptotics of SGD in Sequence-Single Index Models and Single-Layer Attention Networks by Luca Arnaboldi, Bruno Loureiro, Ludovic Stephan, Florent Krzakala, Lenka Zdeborová
  43. The Computational Advantage of Depth in Learning High-Dimensional Hierarchical Targets by Yatin Dandi, Luca Pesce, Lenka Zdeborová, Florent Krzakala (spotlight)
  44. Learning with Restricted Boltzmann Machines: Asymptotics of AMP and GD in High Dimensions by Yizhou Xu, Florent Krzakala, Lenka Zdeborová
  45. OSKAR: Omnimodal Self-supervised Knowledge Abstraction and Representation by Mohamed O Abdelfattah*, Kaouther Messaoud*, Alexandre Alahi
  46. High-dimensional neuronal activity from low-dimensional latent dynamics: a solvable model by Valentin Schmutz, Ali Haydaroglu, Shuqi Wang, Yixiao Feng, Matteo Carandini, Kenneth D. Harris (oral)
  47. Streaming Attention Approximation via Discrepancy Theory by Insu Han, Michael Kapralov, Ekaterina Kochetkova, Kshiteej Sheth, Amir Zandieh (spotlight)
  48. In Search of Lost Language Models Training Dynamics by Zhenting Qi, Fan Nie, Alexandre Alahi, James Zou, Himabindu Lakkaraju, Yilun Du, Eric Xing, Sham Kakade, Hanlin Zhang (oral)
  49. Inductive Domain Transfer In Misspecified Simulation-Based Inference by Ortal Senouf, Cédric Vincent-Cuaz, Emmanuel Abbé, Pascal Frossard
  50. Optimal Spectral Transitions in High-Dimensional Multi-Index Models by Leonardo Defilippis, Yatin Dandi, Pierre Mergny, Florent Krzakala, Bruno Loureiro
  51. Chain-of-Model Learning for Language Model by Kaitao Song, Xiaohua Wang, Xu Tan, Huiqiang Jiang, Chengruidong Zhang, Yongliang Shen, Cen Lu, Zihao Li, Zifan Song, Caihua Shan, Yansen Wang, Kan Ren, Xiaoqing Zheng, Tao Qin, Yuqing Yang, Dongsheng Li, Lili Qiu
  52. Latent Space Factorization in LoRA by Shashi Kumar, Yacouba Kaloga, John Mitros, Petr Motlicek, Ina Kodrasi
  53. Return of ChebNet: Understanding and Improving an Overlooked GNN on Long Range Tasks by Ali Hariri, Álvaro Arroyo, Alessio Gravina, Moshe Eliasof, Carola-Bibiane Schönlieb, Davide Bacciu, Kamyar Azizzadenesheli, Xiaowen Dong, Pierre Vandergheynst (spotlight)

* Shared first authorship and equal contributions.