Biomedical Image Computing

Students can send new ideas and suggestions for possible semester or Master's projects to the following address:



Abstract

This project bridges spectral data and molecular structure identification using a chemistry-informed machine learning framework. Focusing initially on Nuclear Magnetic Resonance (NMR) data, we aim to develop models that extract and interpret essential chemical patterns, enabling accurate predictions of functional groups and molecular structures. By integrating additional spectral modalities such as mass spectrometry (MS) and infrared (IR) spectroscopy in the long term, and by leveraging advances in deep learning and transformer models, this research seeks to redefine automated spectral interpretation, providing efficient, data-driven insights for fields such as drug discovery and metabolomics.


Introduction


Understanding molecular structures is pivotal for advancements in drug discovery, materials science, and biochemistry. NMR spectroscopy is a cornerstone technique for providing detailed structural and functional information about molecules. However, manual interpretation of NMR spectra is labor-intensive and demands significant expertise [1]. This project explores the potential of machine learning to automate and enhance NMR data interpretation. Utilizing advanced deep learning architectures such as Transformers [4] and Graph Neural Networks (GNNs), and embedding chemical principles into the models, we aim to improve the accuracy and interpretability of spectral analysis [5]. Ultimately, the goal is to develop a robust, adaptable framework that can scale to multimodal spectral data, addressing the critical need for automated and scalable solutions in molecular analysis.


Project Objectives

  • Develop a chemistry-informed ML model to predict molecular structures from NMR data, focusing on functional group identification [6], [7].
  • Design a flexible framework integrating Transformers, GNNs, and generative models to handle various chemical representations [8], [9].
  • Explore multimodal integration by incorporating MS and IR spectral data to enhance prediction accuracy [10], [11].
  • Embed chemical relationships and spectroscopy-specific constraints into model architectures and loss functions to improve interpretability and reliability [12].


Methodology


Our approach combines domain knowledge from chemistry and spectroscopy with flexible machine learning models:
 

Dataset

We will utilize a newly introduced multimodal dataset comprising NMR, MS, and IR spectroscopy data to train models for accurate molecular structure prediction [10]. Initially focusing on NMR data, we will leverage its rich structural information to develop robust ML models. In subsequent phases, we will incorporate MS and IR data, expanding the model’s learning capacity across different spectral modalities and enhancing the robustness and accuracy of structure elucidation [13].
 

Model Development

  • Transformers: Utilize self-attention mechanisms to capture dependencies in spectral data, modeling complex relationships where feature significance depends on context [4].
  • Graph Neural Networks (GNNs): Represent molecular connectivity, inferring functional groups and structural motifs through node and edge relationships mapping to chemical bonds and molecular geometry [13].
  • Generative Models: Employ diffusion models to generate candidate structures from spectral data, addressing inverse problems where multiple molecular structures correspond to a given spectral pattern [8].
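As a toy illustration of the GNN component above, the sketch below runs a single round of message passing over a small molecular graph, with mean-aggregation of neighbour features. The function name, one-hot atom features, and update rule are illustrative assumptions, not the project's actual architecture:

```python
# One round of mean-neighbour message passing (GCN-style, no learned weights).
# Illustrative sketch only; a real model would use a GNN library and
# trainable parameters.

def message_pass(features, edges):
    """features: {node: feature vector}; edges: undirected bonds."""
    neighbours = {n: [] for n in features}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    updated = {}
    for node, feat in features.items():
        msgs = [features[m] for m in neighbours[node]]
        # mean over neighbour features (keep own feature if isolated)
        agg = [sum(vals) / len(msgs) for vals in zip(*msgs)] if msgs else feat
        # combine self feature with the aggregated neighbour message
        updated[node] = [0.5 * s + 0.5 * a for s, a in zip(feat, agg)]
    return updated

# Ethanol heavy-atom skeleton C-C-O with one-hot features [is_C, is_O]
feats = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
bonds = [(0, 1), (1, 2)]
h1 = message_pass(feats, bonds)
```

After one round, the central carbon's feature already mixes in information from its oxygen neighbour, which is the mechanism by which stacked layers let a GNN infer functional groups from local connectivity.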

 

Chemistry-Informed Model Design

Embedding chemical relationships and spectroscopy principles into the model's architecture, loss functions, and data representations is crucial. Models will interpret spectral data according to chemical shift ranges, coupling patterns, and functional group signatures [6]. Custom loss functions will penalize predictions deviating from established chemical norms, enhancing both interpretability and predictive accuracy [14].
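As one concrete form such a loss could take, the sketch below combines a standard regression term with a soft penalty on 1H chemical shifts predicted outside a plausible 0-12 ppm window. The window, the quadratic penalty, its weight, and the function name are all illustrative assumptions:

```python
# Hedged sketch of a chemistry-informed loss: mean-squared error plus a
# hinge-style penalty that is zero inside [lo, hi] ppm and grows
# quadratically outside. Values are illustrative, not the project's loss.

def chem_informed_loss(pred_ppm, true_ppm, lo=0.0, hi=12.0, weight=1.0):
    n = len(pred_ppm)
    mse = sum((p - t) ** 2 for p, t in zip(pred_ppm, true_ppm)) / n
    # max(lo - p, 0, p - hi) is 0 inside the window, the violation outside
    penalty = sum(max(lo - p, 0.0, p - hi) ** 2 for p in pred_ppm) / n
    return mse + weight * penalty
```

A prediction of 13.0 ppm against a 11.0 ppm target is penalised both for the 2 ppm error and for leaving the plausible 1H range, so the model is steered back toward chemically sensible shifts.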

 

Expected Outcomes

  • A high-accuracy, chemistry-informed ML model for molecular structure prediction based on spectral data.
  • A flexible framework integrating various ML architectures, adaptable to diverse chemical challenges.
  • Enhanced prediction accuracy through multimodal data integration (NMR, MS, IR).
  • A roadmap for extending the framework to automate complex molecular analyses across chemical and life sciences.

 

Timeline

  • Month 1: Conduct literature review and familiarize with the dataset.
  • Month 2: Experiment with different ML architectures and integrate chemistry-informed design principles.
  • Months 3-4: Train models and refine iteratively, focusing on NMR data.
  • Month 5: Explore integration of MS and IR data, time permitting.
  • Month 6: Evaluate model performance, complete thesis, and prepare for defense.

 

Conclusion

This Master's thesis establishes a model-agnostic, chemistry-informed machine learning framework for spectral data analysis, starting with NMR and expanding to multimodal data. By embedding essential chemical knowledge and utilizing flexible ML architectures, this research aims to advance automated molecular structure analysis, offering efficient and accurate insights. Building on advances in transformer and generative models [4], [15], and integrating multimodal spectral data [10], [16], the project has the potential to significantly enhance automated chemical analysis in drug discovery, metabolomics, and materials science.

 

Figure 1: Illustration of molecular structure representation through spin systems and spectral data.


Supervisor: Nicolas Schmid,

Professor:
Ender Konukoglu, ETF C107,

References

  1. Alberts, Marvin, et al. "Unraveling Molecular Structure: A Multimodal Spectroscopic Dataset for Chemistry." arXiv, 2024. http://arxiv.org/abs/2407.17492.
  2. Marcarino, Maribel O., et al. "NMR Calculations with Quantum Methods: Development of New Tools for Structural Elucidation and Beyond." Accounts of Chemical Research, vol. 53, no. 9, 2020, pp. 1922-1932. https://doi.org/10.1021/acs.accounts.0c00365.
  3. Schmid, N., et al. "Deconvolution of 1D NMR spectra: A deep learning-based approach." Journal of Magnetic Resonance, vol. 347, 2023, p. 107357. https://doi.org/10.1016/j.jmr.2022.107357.
  4. Alberts, Marvin, et al. "Learning the Language of NMR: Structure Elucidation from NMR spectra using Transformer Models." ChemRxiv, 2023. https://chemrxiv.org/engage/chemrxiv/article-details/64d5e4ccdfabaf06ff1763ef.
  5. Schilter, Oliver, et al. "Unveiling the Secrets of 1H-NMR Spectroscopy: A Novel Approach Utilizing Attention Mechanisms." AI for Accelerated Materials Design - NeurIPS 2023 Workshop, 2023. https://openreview.net/forum?id=4ilKwquW51.
  6. Specht, Thomas, et al. "Automated Methods for Identification and Quantification of Structural Groups from Nuclear Magnetic Resonance Spectra Using Support Vector Classification." Journal of Chemical Information and Modeling, vol. 61, no. 1, 2021, pp. 143-155. https://doi.org/10.1021/acs.jcim.0c01186.
  7. Li, Chongcan, et al. "Identifying molecular functional groups of organic compounds by deep learning of NMR data." Magnetic Resonance in Chemistry, 2022. https://onlinelibrary.wiley.com/doi/abs/10.1002/mrc.5292.
  8. Litsa, Eleni E., et al. "An end-to-end deep learning framework for translating mass spectra to de-novo molecules." Communications Chemistry, vol. 6, 2023, pp. 1-12. https://www.nature.com/articles/s42004-023-00932-3.
  9. Tan, Xiaofeng. "A Transformer Based Generative Chemical Language AI Model for Structural Elucidation of Organic Compounds." 2023.
  10. Alberts, Marvin, et al. "Leveraging Infrared Spectroscopy for Automated Structure Elucidation." ChemRxiv, 2023. https://chemrxiv.org/engage/chemrxiv/article-details/645df5cbf2112b41e96da616.
  11. Yao, Lin, et al. "Conditional Molecular Generation Net Enables Automated Structure Elucidation Based on 13C NMR Spectra and Prior Knowledge." Analytical Chemistry, vol. 95, no. 12, 2023, pp. 5393-5401. https://doi.org/10.1021/acs.analchem.2c05817.
  12. Yao, Lin, et al. "Conditional Molecular Generation Net Enables Automated Structure Elucidation Based on 13C NMR Spectra and Prior Knowledge." Analytical Chemistry, vol. 95, no. 12, 2023, pp. 5393-5401. https://doi.org/10.1021/acs.analchem.2c05817.
  13. Hu, Guilin, and Ming-Hua Qiu. "Machine learning-assisted structure annotation of natural products based on MS and NMR data." Natural Product Reports, vol. 40, 2023, pp. 1735-1753. https://doi.org/10.1039/d3np00025g.
  14. Schmid, N., et al. "Deconvolution of 1D NMR spectra: A deep learning-based approach." Journal of Magnetic Resonance, vol. 347, 2023, p. 107357. https://doi.org/10.1016/j.jmr.2022.107357.
  15. Alberts, Marvin, et al. "Multimodal Transformer models for Structure Elucidation from Spectra." American Chemical Society (ACS) Spring Meeting, 2024. https://research.ibm.com/publications/multimodal-transformer-models-for-structure-elucidation-from-spectra.
  16. Alberts, Marvin, et al. "From Spectra to Structure: Automated structure elucidation for organic chemistry." Swiss Chemical Society, Division Medicinal Chemistry and Chemical Biology Basel Symposium, 2024. https://research.ibm.com/publications/from-spectra-to-structure-automated-structure-elucidation-for-organic-chemistry.

This project aims to create a benchmark of current detection networks for anatomy detection/segmentation and potentially improve these for intracranial surgery.

Supervisor: Gary Sarwin,

Professor: Ender Konukoglu

 

This project aims to use large language models to predict the acceptance of academic papers by combining textual and contextual information. Leveraging text and figures from manuscripts alongside metadata (e.g., venue/journal), the project seeks to train a model that can evaluate the likelihood of acceptance based on established criteria such as relevance, novelty, clarity, and methodological soundness.

Supervisor: Gary Sarwin,

Professor: Ender Konukoglu

 

This project explores the potential of Vision-Language Models (VLMs) to predict surgical phases solely based on visual inputs from surgical videos, leveraging pre-existing knowledge without the need for labeled data or additional training. By inputting surgical procedure videos into a VLM, the project aims to assess the model’s ability to recognize and sequence phases of surgery based on its general visual-linguistic understanding of typical surgical steps and objects.

The project hypothesizes that VLMs can inherently identify relevant surgical contexts and transitions due to their extensive pre-training on diverse multimodal datasets. Through this work, we seek to demonstrate the feasibility of using VLMs as zero-shot predictors in specialized medical tasks, with potential applications in intraoperative decision support.

Supervisor: Gary Sarwin,

Professor: Ender Konukoglu

 

Description for the Master project

One promising method for cardiovascular disease prediction involves analyzing retinal images, as the retinal vasculature provides insights into cardiovascular health, including stroke risk. In previous work, a framework was developed that combines graph-based retinal image representations with clinical data to enhance stroke prediction using a contrastive self-supervised model. The aim of this project is to extend this previous work. In particular, we aim to make predictions interpretable, which is crucial for clinically relevant machine learning models. Further, we aim to cope with incomplete data, improve our model's architecture by incorporating state-of-the-art graph encoders, and potentially fine-tune or build upon recent foundation models. Lastly, we aim to explore additional downstream tasks for cardiovascular diseases with potentially new modalities, such as brain images, that can be paired with retinal fundus image graph representations. To summarize, the focus lies on exploring the use of retinal fundus image graph representations in a contrastive learning framework. In-depth literature research will be expected. The aim should also be to contribute to a scientific publication.

Your qualifications / what we are looking for

  1. Knowledge of common machine learning paradigms and architectures (graph neural networks, self-supervised learning, etc.)
  2. Knowledge of explainable and interpretable AI is advantageous
  3. Excellent programming skills in Python as well as familiarity with PyTorch (and PyTorch geometric)
  4. Full-time commitment towards the completion of your project
  5. Ability to work independently on challenging projects
  6. Ability to understand scientific papers and conduct literature research
  7. Prior medical knowledge is advantageous

How to apply
Please send your CV and transcript to Neda Davoudi () and Bastian Wittmann ().
Links to previous work (e.g., your GitHub profile) are highly appreciated.

Co-Supervisors: Neda Davoudi, Bastian Wittmann, and Bjoern Menze

Professor: Ender Konukoglu
 

References:

[1] “Retinal vasculature of different diameters and plexuses exhibit distinct vulnerability in varying severity of diabetic retinopathy” (https://www.nature.com/articles/s41433-024-03021-4)
[2] “Geometric deep learning for disease classification in OCTA images” (https://iovs.arvojournals.org/article.aspx?articleid=2790851)
[3] “A foundation model for generalizable disease detection from retinal images” (https://www.nature.com/articles/s41586-023-06555-x)
[4] “Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data” (https://arxiv.org/abs/2303.14080)
[5] “GNNExplainer: Generating Explanations for Graph Neural Networks” (https://arxiv.org/abs/1903.03894)
[6] “TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data” (https://arxiv.org/abs/2407.07582)
[7] “Link prediction for flow-driven spatial networks” (https://arxiv.org/abs/2303.14501)
[8] "The emerging role of combined brain/heart magnetic resonance imaging for the evaluation of brain/heart interaction in heart failure" (https://www.mdpi.com/2077-0383/11/14/4009)
[9] Ferrando SB et al. Stroke and Retinal microvascular changes: Neuroimaging markers of brain damage and association with retinal Optical Coherence Tomography Angiography parameters.

Introduction:
In statistical learning, understanding the models' bias-variance trade-off is crucial, particularly under specific assumptions. This concept is vital from a domain generalization standpoint, as it relates to the divergence between source and target distributions. In the field of medical imaging, Castro et al. [1] have highlighted this by using a causal diagram (see Fig. 5) to illustrate medical image generation, which informs the divergence and consequently the bias-variance trade-off for an optimal statistical model.
Building on Castro et al.'s framework and the principles of the Shepp-Logan phantom [2], our project aims to develop a mechanism for generating toy medical image data. This mechanism will allow us to freely define and manipulate certain assumptions, thereby enabling fast and effective assessment of our models. Familiarity with Python and PyTorch will be beneficial, as they form the basis of our development and assessment platform.
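To make the idea concrete, a toy generator might expose the assumed causal factors of the image (here: ellipse size and intensity) as explicit parameters that can be varied at will. The function name, coordinate convention, and parameter choices below are illustrative, not the project's final mechanism:

```python
# Toy phantom in the spirit of a heavily simplified Shepp-Logan phantom:
# a single centred ellipse whose semi-axes and intensity are controlled
# directly, so the "causes" of the image are fully known. Illustrative only.

def make_phantom(n=32, rx=0.5, ry=0.3, intensity=1.0):
    """n x n image of a centred ellipse with semi-axes rx, ry in [0, 1]."""
    img = []
    for i in range(n):
        row = []
        for j in range(n):
            # map pixel centre to [-1, 1] coordinates
            y = 2.0 * (i + 0.5) / n - 1.0
            x = 2.0 * (j + 0.5) / n - 1.0
            inside = (x / rx) ** 2 + (y / ry) ** 2 <= 1.0
            row.append(intensity if inside else 0.0)
        img.append(row)
    return img

phantom = make_phantom(n=32, rx=0.6, ry=0.4, intensity=2.0)
```

Sampling distributions over such parameters (and shifting them between "source" and "target" sets) would then give a controlled way to probe the bias-variance behaviour of a model under known distribution divergence.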


References:
[1] Castro, D.C., Walker, I. & Glocker, B. Causality matters in medical imaging. Nat Commun 11, 3673 (2020). https://doi.org/10.1038/s41467-020-17478-w

[2] L. A. Shepp and B. F. Logan, "The Fourier reconstruction of a head section," in IEEE Transactions on Nuclear Science, vol. 21, no. 3, pp. 21-43, June 1974, doi: 10.1109/TNS.1974.6499235.


Supervisor:

Güney Tombak,
 

Professor: Ender Konukoglu,

Introduction:
Recent developments in the field of computer vision have highlighted the growing prominence of foundation models, particularly those like DINOv2 [1] and Segment-Anything [2], which have achieved impressive outcomes in processing natural images. Yet, the effectiveness of these models in medical imaging remains somewhat ambiguous. This project intends to bridge this gap by rigorously examining various training methodologies for these models. Our goal is to explore the most effective approaches to adapt these advanced foundation models for medical imaging, thereby enhancing their utility and potential impact in healthcare and medical research.

References:

[1] https://dinov2.metademolab.com/
[2] https://segment-anything.com/  

Supervisors:
Ertunc Erdil,
 

Professor: Ender Konukoglu,

This project aims to design an adaptable encoder model that leverages various medical imaging datasets. The model will be trained utilizing self-supervised and contrastive learning methods such as SimCLR [1] and masked autoencoders [2], emphasizing versatility and high performance to serve multiple applications in the medical imaging sector.

The project offers an opportunity to experiment with different machine learning paradigms, improve model performance, and tackle the unique challenges presented by medical image datasets. The objective is to create a robust encoder model that can effectively serve as a backbone for a variety of tasks in medical imaging. Prerequisites for this project include a solid understanding of deep learning and prior experience with the PyTorch framework.
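For concreteness, below is a minimal sketch of the NT-Xent loss at the heart of SimCLR [1], written with plain Python lists to stay self-contained; a real implementation would operate on batched tensors, and the embeddings and temperature here are illustrative:

```python
import math

# NT-Xent (normalized temperature-scaled cross-entropy) sketch: for each
# positive pair (i, j), treat every other sample in the batch as a negative
# and apply a softmax cross-entropy over cosine similarities.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nt_xent(embeddings, pos_pairs, temperature=0.5):
    """Mean loss over ordered positive pairs (i, j) within the batch."""
    n = len(embeddings)
    total = 0.0
    for i, j in pos_pairs:
        # similarities of anchor i to every other sample (i itself excluded)
        logits = [cosine(embeddings[i], embeddings[k]) / temperature
                  for k in range(n) if k != i]
        target = j if j < i else j - 1  # index of j after dropping i
        log_den = math.log(sum(math.exp(l) for l in logits))
        total += log_den - logits[target]  # -log softmax at the positive
    return total / len(pos_pairs)

# two aligned views of one sample plus an orthogonal negative
emb = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
loss = nt_xent(emb, [(0, 1), (1, 0)])
```

Minimising this loss pulls the two augmented views of the same image together in embedding space while pushing other images away, which is what makes the resulting encoder a reusable backbone.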

References:

[1] Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020, November). A simple framework for contrastive learning of visual representations. In International conference on machine learning (pp. 1597-1607). PMLR.

[2] He, K., Chen, X., Xie, S., Li, Y., Dollár, P., & Girshick, R. (2022). Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 16000-16009).


Supervisors:
Güney Tombak ()
Ertunc Erdil ()

Professor:
Ender Konukoglu, ETF E113,

Reducing the time of magnetic resonance imaging (MRI) data acquisition is a long-standing goal. Shorter acquisition times would bring many benefits, e.g. higher patient comfort or enabling dynamic imaging (e.g. of the moving heart). Ultimately, they could lead to higher clinical throughput, reducing the cost of MRI per patient and making MRI more widely accessible.
One possible avenue towards this goal is to under-sample the acquisition and incorporate prior knowledge to solve the resulting ill-posed reconstruction problem. This strategy has received much attention, and many different methods have been proposed.
In this project we aim to understand performance differences between the different methods and analyse which components make them work. We will implement state-of-the-art reconstruction methods and perform experiments to judge their performance and robustness properties.
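One component shared by many of these reconstruction methods is a data-consistency step: wherever a k-space sample was actually acquired, the current estimate is overwritten by the measurement, and the prior-driven estimate is kept elsewhere. A 1-D toy sketch with illustrative values (real pipelines operate on complex, multi-dimensional k-space):

```python
# Data-consistency projection for under-sampled MRI reconstruction:
# keep measured k-space samples, keep the estimate where nothing was
# acquired. Toy 1-D real-valued example; values are illustrative.

def data_consistency(estimate, measured, mask):
    """mask[i] == 1 where k-space sample i was acquired."""
    return [m if acquired else e
            for e, m, acquired in zip(estimate, measured, mask)]

kspace_est = [0.9, 0.2, 0.4, 0.1]   # e.g. output of a learned prior
kspace_meas = [1.0, 0.0, 0.5, 0.0]  # zeros where nothing was acquired
mask = [1, 0, 1, 0]
consistent = data_consistency(kspace_est, kspace_meas, mask)
```

Supervised, unsupervised, and untrained methods differ mainly in where the non-acquired entries come from (a trained network, a learned prior, or an untrained network's inductive bias), while a projection of this kind ties the reconstruction to the physics of the measurement.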


Depending on the student's interests, the project can have a different focus:

  • Supervised Methods [1]
  • Unsupervised Methods [2], [3], [4]
  • Untrained Methods [5]


References:

[1]: https://onlinelibrary.wiley.com/doi/10.1002/mrm.28827
[2]: https://ieeexplore.ieee.org/document/8579232
[3]: https://ieeexplore.ieee.org/document/9695412
[4]: https://link.springer.com/chapter/10.1007/978-3-031-16446-0_62
[5]: https://arxiv.org/abs/2111.10892

 

Supervisors:
Georg Brunner ()
Emiljo Mehillaj ()

Professor: Ender Konukoglu, ETF E113,
