¹MIx Group, University of Birmingham; ²University of Science and Technology of China
The Journal of Physical Chemistry A
Infrared (IR) spectroscopy, a type of vibrational spectroscopy, provides rich information about molecular structure and is a highly effective technique for chemists to determine molecular structures. However, analysing experimental spectra remains challenging, both because it requires specialised knowledge and because spectra vary with experimental conditions.
Here, we propose a Transformer-based model with a patch-based self-attention spectrum embedding layer, designed to prevent the loss of spectral information while maintaining simplicity and effectiveness. To further enhance the model’s understanding of IR spectra, we introduce a data augmentation approach, which selectively introduces vertical noise only at absorption peaks.
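The two ideas above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the patch size, the local-maximum peak detector, the intensity threshold, and the multiplicative Gaussian noise are all assumptions chosen for illustration, and "vertical noise" is interpreted here as a perturbation of absorbance intensity. The sketch only shows how a spectrum might be cut into patches (which a self-attention embedding layer could then consume) and how noise might be restricted to peak positions.

```python
import numpy as np


def split_into_patches(spectrum, patch_size):
    """Split a 1-D spectrum into non-overlapping patches.

    The tail is zero-padded so the length is a multiple of patch_size
    (padding strategy is an assumption, not taken from the paper).
    """
    spectrum = np.asarray(spectrum, dtype=float)
    pad = (-len(spectrum)) % patch_size
    padded = np.pad(spectrum, (0, pad))
    return padded.reshape(-1, patch_size)


def augment_peaks(spectrum, rng, threshold=0.1, noise_scale=0.05):
    """Perturb intensities only at absorption peaks.

    A "peak" here is a strict local maximum above `threshold`
    (hypothetical criterion). Each peak is scaled by (1 + eps) with
    eps ~ N(0, noise_scale); all other points are left unchanged.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    out = spectrum.copy()
    interior = spectrum[1:-1]
    is_peak = (
        (interior > spectrum[:-2])
        & (interior > spectrum[2:])
        & (interior > threshold)
    )
    peak_idx = np.flatnonzero(is_peak) + 1  # shift back to full-array indices
    noise = rng.normal(0.0, noise_scale, size=peak_idx.size)
    out[peak_idx] = np.clip(spectrum[peak_idx] * (1.0 + noise), 0.0, None)
    return out, peak_idx


# Toy spectrum with two peaks, at indices 3 and 7.
toy = np.zeros(10)
toy[3], toy[7] = 1.0, 0.5

patches = split_into_patches(toy, patch_size=4)
augmented, peaks = augment_peaks(toy, np.random.default_rng(0))
```

In this toy example the ten-point spectrum is padded to twelve points and reshaped into three patches of four, and only the two peak positions are perturbed while the baseline stays untouched.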
Our approach not only achieves state-of-the-art performance on simulated datasets but also attains a top-1 accuracy of 55% on real experimental spectra, surpassing the previous state-of-the-art by approximately 10%. Additionally, our model demonstrates proficiency in analysing intricate and variable fingerprint regions, effectively extracting critical structural information.
@article{wu2025transformer,
  title={Transformer-Based Models for Predicting Molecular Structures from Infrared Spectra Using Patch-Based Self-Attention},
  author={Wu, Wenjin and Leonardis, Aleš and Jiao, Jianbo and Jiang, Jun and Chen, Linjiang},
  journal={The Journal of Physical Chemistry A},
  volume={129},
  number={8},
  pages={2077-2085},
  year={2025},
  doi={10.1021/acs.jpca.4c05665},
  publisher={ACS Publications}
}