Professor Rubén Sánchez-García has published “TomoCPT: a generalizable model for 3D particle detection and localization in cryo-electron tomograms”. This paper is the result of a research project in collaboration with the University of Oxford and IE University.
Cryo-electron tomography (cryo-ET) is a rapidly developing technique for studying macromolecular complexes in their native environments and has the potential to revolutionize our understanding of protein function. However, fast and accurate identification of particles in cryo-tomograms is challenging and represents a significant bottleneck for downstream processes such as subtomogram averaging. In the paper, Professor Sánchez-García and his collaborators present TomoCPT (Tomogram Centroid Prediction Tool), a transformer-based solution that reformulates particle detection as a centroid-prediction task using Gaussian labels. The approach, which is built on the SwinUNETR architecture, outperforms both conventional binary labelling strategies and template matching. The paper also shows that TomoCPT generalizes to novel particle types through zero-shot inference and that its performance improves substantially after fine-tuning with limited data.
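To illustrate the difference between Gaussian centroid labels and conventional binary labels, the following is a minimal NumPy sketch. It is not taken from TomoCPT: the function names, the `sigma` and `radius` values, and the toy volume size are all assumptions chosen for illustration. A binary label marks every voxel within a fixed radius of a particle as 1, whereas a Gaussian label peaks at exactly 1 on the centroid and decays smoothly with distance, so the training target itself encodes where the centre is.

```python
import numpy as np


def _squared_distances(shape, centroid):
    """Squared voxel distance from every voxel in `shape` to `centroid` (z, y, x)."""
    zz, yy, xx = np.meshgrid(
        np.arange(shape[0]), np.arange(shape[1]), np.arange(shape[2]), indexing="ij"
    )
    cz, cy, cx = centroid
    return (zz - cz) ** 2 + (yy - cy) ** 2 + (xx - cx) ** 2


def gaussian_label_volume(shape, centroids, sigma=2.0):
    """Hypothetical helper: one Gaussian peak (maximum 1.0) per particle centroid."""
    volume = np.zeros(shape, dtype=np.float32)
    for centroid in centroids:
        blob = np.exp(-_squared_distances(shape, centroid) / (2.0 * sigma**2))
        # Where particles overlap, keep the stronger response instead of summing.
        volume = np.maximum(volume, blob.astype(np.float32))
    return volume


def binary_label_volume(shape, centroids, radius=3.0):
    """Hypothetical helper: conventional binary mask, 1 inside a sphere per particle."""
    volume = np.zeros(shape, dtype=np.uint8)
    for centroid in centroids:
        volume[_squared_distances(shape, centroid) <= radius**2] = 1
    return volume


# Toy example: two particles in a 16 x 16 x 16 volume (coordinates are made up).
centroids = [(4, 4, 4), (10, 11, 12)]
gaussian_labels = gaussian_label_volume((16, 16, 16), centroids)
binary_labels = binary_label_volume((16, 16, 16), centroids)
```

In this sketch the binary mask is flat inside each sphere, so the centroid is indistinguishable from its neighbours, while the Gaussian volume has a unique maximum at each centroid that a peak-finding step could recover.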
The efficacy of TomoCPT was validated using three case studies: apoferritin, achieving a resolution of 3.0 Å compared with 3.3 Å using template matching; SARS-CoV-2 spike proteins on cell surfaces, yielding an 18.3 Å resolution map where template matching proved unsuccessful; and rubisco molecules within carboxysomes, reaching 8.0 Å resolution. These results demonstrate the ability of TomoCPT to handle varied scenarios, including densely packed environments and membrane-bound proteins. The implementation of the tool as a command-line program, coupled with its minimal data requirements for fine-tuning, makes it a practical solution for high-throughput cryo-ET data-processing workflows.
Rubén Sánchez-García is an assistant professor at IE SciTech School. He is also a member of IE Research Datalab, specializing in Data Science and AI, Computational Biology, and Structural Biology.