I gave an online talk titled “Mean-field Models for Self-attention Dynamics in Transformers” at the BIRS workshop “Interacting Particle Systems: Theoretical Innovations and Practical Applications” in Hangzhou, China. My talk was based on the following paper:
I gave a talk at the Workshop on the Mathematics of Transformers, hosted at Julius Maximilians University of Würzburg. You can find more details on the event here. My talk was based on the following paper: