Research Group Leader, ELLIS Institute & Max Planck Institute for Intelligent Systems
Reasoning capabilities represent a critical frontier for large language models (LLMs), but developing them requires extensive proprietary datasets and computational resources. Model merging offers an alternative: it combines the weights, and hence the capabilities, of multiple models without retraining. However, merging relies on manually designed strategies and hand-tuned hyperparameters, which limits the exploration of potential model combinations and requires significant human effort. This sounds like an AutoML problem, and it is. In this talk I will discuss this problem setting and related applications, as well as a recent publication in which we found that standard AutoML tools, adapted to this domain, can find good solutions for this application.
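As a rough illustration of the idea behind merging, here is a minimal sketch of linear weight interpolation between two models with identical architectures, represented as name-to-array dictionaries. The mixing coefficient `alpha` stands in for the kind of merging hyperparameter an AutoML search could tune; all names here are illustrative and not taken from the talk or the publication.

```python
import numpy as np

def merge_weights(model_a, model_b, alpha=0.5):
    """Parameter-wise linear interpolation: alpha * A + (1 - alpha) * B."""
    return {name: alpha * model_a[name] + (1.0 - alpha) * model_b[name]
            for name in model_a}

# Toy example: two "models" with a single 2x2 weight matrix each.
a = {"layer1": np.ones((2, 2))}
b = {"layer1": np.zeros((2, 2))}
merged = merge_weights(a, b, alpha=0.25)  # every entry is 0.25 * 1 + 0.75 * 0
```

In practice, merging methods vary both the strategy (e.g. per-layer coefficients, sparsification before averaging) and its hyperparameters, which is exactly the search space an automated approach would explore.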
Bio. Jonas is an ML researcher in Tübingen, Germany, where he leads the research group for safety- & efficiency-aligned learning. Before this, he spent time at the Universities of Maryland, Siegen, and Münster. In efficient learning, he studies how to build systems that do more with less, from weight averaging techniques to recursive computation approaches that extend model capabilities, with a particular interest in how these systems reason and whether their reasoning abilities can be enhanced while maintaining efficiency. How do we build mechanisms that let these models learn to be intelligent systems?