TY - JOUR
T1 - Language model-guided anticipation and discovery of mammalian metabolites
AU - Qiang, Hantao
AU - Wang, Fei
AU - Lu, Wenyun
AU - Xing, Xi
AU - Kim, Hahn
AU - Mérette, Sandrine A.M.
AU - Ayres, Lucas B.
AU - Oler, Eponine
AU - AbuSalim, Jenna E.
AU - Roichman, Asael
AU - Neinast, Michael
AU - Cordova, Ricardo A.
AU - Lee, Won Dong
AU - Herbst, Ehud
AU - Gupta, Vishu
AU - Neff, Samuel L.
AU - Hiebert-Giesbrecht, Mickel
AU - Young, Adamo
AU - Gautam, Vasuk
AU - Tian, Siyang
AU - Wang, Bo
AU - Röst, Hannes
AU - Baidwan, Jatinder
AU - Greiner, Russell
AU - Chen, Li
AU - Johnston, Chad W.
AU - Foster, Leonard J.
AU - Shapiro, Aaron M.
AU - Wishart, David S.
AU - Rabinowitz, Joshua D.
AU - Skinnider, Michael A.
N1 - Publisher Copyright: © The Author(s) 2026.
PY - 2026/1/14
Y1 - 2026/1/14
N2 - Despite decades of study, large parts of the mammalian metabolome remain unexplored. Mass spectrometry-based metabolomics routinely detects thousands of small molecule-associated peaks in human tissues and biofluids, but typically only a small fraction of these can be identified, and structure elucidation of novel metabolites remains challenging. Biochemical language models have transformed the interpretation of DNA, RNA and protein sequences, but have not yet had a comparable impact on understanding small molecule metabolism. Here we present an approach that leverages chemical language models to anticipate the existence of previously uncharacterized metabolites. We introduce DeepMet, a chemical language model that learns from the structures of known metabolites to anticipate the existence of previously unrecognized metabolites. Integration of DeepMet with mass spectrometry-based metabolomics data facilitates metabolite discovery. We harness DeepMet to reveal several dozen structurally diverse mammalian metabolites. Our work demonstrates the potential for language models to advance the mapping of the mammalian metabolome.
AB - Despite decades of study, large parts of the mammalian metabolome remain unexplored. Mass spectrometry-based metabolomics routinely detects thousands of small molecule-associated peaks in human tissues and biofluids, but typically only a small fraction of these can be identified, and structure elucidation of novel metabolites remains challenging. Biochemical language models have transformed the interpretation of DNA, RNA and protein sequences, but have not yet had a comparable impact on understanding small molecule metabolism. Here we present an approach that leverages chemical language models to anticipate the existence of previously uncharacterized metabolites. We introduce DeepMet, a chemical language model that learns from the structures of known metabolites to anticipate the existence of previously unrecognized metabolites. Integration of DeepMet with mass spectrometry-based metabolomics data facilitates metabolite discovery. We harness DeepMet to reveal several dozen structurally diverse mammalian metabolites. Our work demonstrates the potential for language models to advance the mapping of the mammalian metabolome.
UR - https://www.scopus.com/pages/publications/105027538755
U2 - 10.1038/s41586-025-09969-x
DO - 10.1038/s41586-025-09969-x
M3 - Article
C2 - 41535467
AN - SCOPUS:105027538755
SN - 0028-0836
JO - Nature
JF - Nature
ER -