Bigger machine learning models, better physics
Scientists working on the ATLAS experiment at CERN have shown that transformer models, which are the same family of AI architectures used in today’s large language models, perform exceptionally well at identifying different types of heavy particles produced in proton collisions. The findings confirm that high-energy physics benefits from the same scaling laws driving progress in AI more broadly: larger models trained on more data and more compute deliver systematically better performance.
A new ATLAS study published January 30, 2026, “Carpe Datum: Scaling behavior of transformers for heavy hadron flavor identification,” pushes these questions to unprecedented scale. The collaboration investigated how transformer-based flavor-tagging models behave as both model capacity and dataset size increase, an approach informed by recent advances in AI research. The results, to which Dr. Nicole Hartman and doctoral student Matthias Vigl from the Technical University of Munich’s School of Natural Sciences (NAT) contributed, provide a clear answer: performance continues to improve with scale, with no sign of saturation yet.
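The scaling laws referred to here typically describe performance as a power law in model or dataset size. A minimal sketch of what such an analysis looks like, using synthetic numbers (none of the values below come from the ATLAS study):

```python
import numpy as np

# Illustrative only: synthetic points loosely mimicking a neural scaling law,
#   loss(N) ≈ a * N^(-b) + c,
# where N is the number of model parameters and c is an irreducible loss floor.
model_sizes = np.array([1e6, 3e6, 1e7, 3e7, 1e8])
a_true, b_true, c_true = 5.0, 0.3, 0.05
losses = a_true * model_sizes ** (-b_true) + c_true

# Fitting the exponent b: on a log-log scale (after subtracting the floor c),
# a power law becomes a straight line with slope -b.
log_n = np.log(model_sizes)
log_excess = np.log(losses - c_true)
slope, log_a_fit = np.polyfit(log_n, log_excess, 1)
print(f"fitted exponent: {-slope:.3f}")
```

If the fitted line stays straight out to the largest models tested, as the study reports, performance has not yet saturated: adding parameters and data keeps paying off at a predictable rate.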
Prof. Lukas Heinrich (NAT) explains that transformer architectures are uniquely suited to interpreting the complex, high-dimensional signals produced by the ATLAS detector. By capturing subtle correlations within particle collision data, these models excel at flavor tagging, the process of determining what type of quark initiated a particle jet. This capability is essential for precision studies of the Standard Model, including analyses involving the Higgs boson and searches for new physics.
The EP News article emphasizes that flavor tagging sits at the intersection of jet physics and advanced machine learning: jets are among the most abundant objects produced in LHC collisions, and distinguishing between quark types (e.g., bottom, charm, or light quarks) requires sophisticated pattern recognition tools. Transformer-based approaches allow ATLAS to move beyond traditional hand-engineered algorithms and toward unified, end-to-end machine learning models that directly process low-level detector information.
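Why transformers fit this problem: a jet is an unordered, variable-length set of reconstructed tracks, and self-attention lets every track exchange information with every other, so correlations such as tracks sharing a displaced vertex can be learned directly. A minimal single-head self-attention sketch in NumPy (all weights and feature dimensions here are made up for illustration, not the architecture ATLAS uses):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tracks, w_q, w_k, w_v):
    """Single-head self-attention over one jet's tracks.

    tracks: (n_tracks, d) array of per-track features
            (e.g. momentum, impact parameters).
    Every track attends to every other track, so pairwise
    correlations within the jet enter the output directly.
    """
    q, k, v = tracks @ w_q, tracks @ w_k, tracks @ w_v
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return scores @ v

d, d_head = 8, 16                      # per-track features, attention width
w_q, w_k, w_v = (rng.normal(size=(d, d_head)) for _ in range(3))
jet = rng.normal(size=(5, d))          # a jet with 5 reconstructed tracks
out = self_attention(jet, w_q, w_k, w_v)
print(out.shape)                       # one embedding per track: (5, 16)
```

Because the operation has no notion of track order, reordering the input tracks simply reorders the output embeddings, and pooling them yields a fixed-size jet representation for flavor classification regardless of how many tracks the jet contains.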
As CERN prepares for the High-Luminosity Large Hadron Collider (HL-LHC), which will deliver roughly an order of magnitude more collision data, these scaling-aware AI techniques will become indispensable. Larger datasets and higher event rates will demand models that can keep pace with the growing complexity of the detector environment. Heinrich notes that the ongoing work within ATLAS to integrate advanced AI models into reconstruction and data processing workflows is therefore both strategic and necessary.
Overall, this new study highlights how cutting-edge AI and high-energy physics increasingly reinforce one another. By demonstrating the scaling behavior of transformer-based flavor tagging models, ATLAS is charting a path toward more powerful, versatile analysis tools that will expand the scientific reach of the LHC in the years ahead.
More information and links
- ATLAS at CERN https://atlas.cern/
- ORIGINS Cluster of Excellence https://www.origins-cluster.de/en/
- ATLAS scales up AI for jet physics and reveals flavour-tagging scaling laws. EP Newsletter. https://ep-news.web.cern.ch/content/atlas-scales-ai-jet-physics-and-reveals-flavour-tagging-scaling-laws
Scientific articles
- The ATLAS Collaboration. Carpe Datum: Scaling behavior of transformers for heavy hadron flavor identification. ATLAS PUB Note https://cds.cern.ch/record/2953659/files/ATL-SOFT-PUB-2026-002.pdf
- The ATLAS Collaboration. Transforming jet flavour tagging at ATLAS. Nature Communications. doi: 10.1038/s41467-025-65059-6
Contact for this article
Prof. Lukas Heinrich
Data Science in Physics Laboratory
https://www.decodingnature.com/
l.heinrich@tum.de
+49 89 35831-7141
Press contact
communications@nat.tum.de