2024 internship opportunities for Master’s students

Listed below are the internship opportunities currently offered by diiP. These offers are open to second-year Master’s students.

Biomarker prediction from images of histological slides of cancerous tissues using modern AI techniques: comparisons of different neural architectures

Supervised by: Nicolas Loménie (Université Paris Cité)

Description: Digital pathology is on the rise. Most examinations are still performed visually under an analog microscope, but new slide scanners (Philips, Hamamatsu, etc.) are driving revolutionary changes at the clinical level (Fig. 1). Given the rich information present in whole-slide images (WSIs), a vast field of opportunity opens up for automated image analysis. This is especially pertinent in oncology, where early and accurate diagnosis is paramount. With the advent of transformer architectures, traditionally used in natural language processing, there is emerging interest in assessing their applicability to WSI classification.

How to apply: please send a motivation letter and a CV to nicolas.lomenie[at]mi.parisdescartes.fr

Further information can be found here.

Development of automated segmentation and clustering methods for spICP-ToF-MS time-series in Nanogeochemistry

Supervised by: Mickaël Tharaud (IPGP)

Description: Nanoparticles (NPs) are pervasive in natural systems and play a crucial role in nanogeochemistry. The emergence of single-particle time-of-flight inductively coupled plasma mass spectrometry (spICP-ToF-MS) has revolutionized NP characterization while presenting new challenges in data analysis. This research project seeks to bridge advanced nano-instrumentation with data-driven insights, focusing on the development of standardized methodologies for integrating spICP-ToF-MS with state-of-the-art machine learning algorithms. The IPGP hosts a world-leading geochemistry platform (PARI) equipped with an operational spICP-ToF-MS instrument and holds an extensive dataset. The project will build on these resources to (1) develop a novel methodology for the automated segmentation and clustering of NP time series generated by spICP-ToF-MS, (2) address challenges including instrumental noise, unknown NP compositions, and large data volumes requiring sophisticated statistical methods, and (3) explore interdisciplinary collaboration between geochemists, data scientists, and analytical chemists.

Proposed methods: Preliminary tests have shown encouraging results using a 4-step methodology described and illustrated below:
1. Detection: Establish a conservative threshold for detecting significant NP signals within time series data using intensity distribution across channels (b).
2. Clustering: Identify families of NP signals through unsupervised clustering algorithms, considering the unknown number of NP families in natural environments.
3. Classification: Train a classifier to differentiate various NPs within continuous time series, including an additional noise class, using realistic data.
4. Segmentation: Divide time series into segments based on the classifier, addressing the challenge of determining optimal segmentation window size (c).
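
The first two steps above can be illustrated with a minimal sketch. It is not the project's actual pipeline: the robust threshold (median plus k times a MAD-based sigma), the function names, and the synthetic three-channel data are all assumptions made for illustration. The sketch flags time steps above a conservative threshold, groups them into contiguous events, and computes a per-event channel composition, which is the kind of feature an unsupervised clustering algorithm could then operate on.

```python
import numpy as np


def detect_events(signal, k=5.0):
    """Flag time steps whose summed intensity exceeds a conservative threshold.

    signal has shape (n_timesteps, n_channels). The threshold
    median + k * sigma, with sigma estimated from the median absolute
    deviation (MAD), is an assumed choice, not the method used at IPGP.
    """
    total = signal.sum(axis=1)                      # sum over mass channels
    median = np.median(total)
    sigma = 1.4826 * np.median(np.abs(total - median))  # MAD -> std (Gaussian)
    return total > median + k * sigma


def group_events(mask):
    """Convert a boolean mask into (start, stop) pairs of contiguous runs."""
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(a), int(b)) for a, b in zip(edges[::2], edges[1::2])]


def event_compositions(signal, events):
    """Return the fraction of each event's signal carried by each channel."""
    if not events:
        return np.empty((0, signal.shape[1]))
    background = np.median(signal, axis=0)          # per-channel baseline
    sums = np.array([(signal[a:b] - background).sum(axis=0) for a, b in events])
    sums = np.clip(sums, 0.0, None)                 # drop negative noise residue
    return sums / sums.sum(axis=1, keepdims=True)


# Synthetic example (hypothetical data): Gaussian noise plus two injected events.
rng = np.random.default_rng(0)
signal = rng.normal(0.0, 1.0, size=(1000, 3))
signal[100:103, 0] += 50.0                          # event dominated by channel 0
signal[500:502, 2] += 40.0                          # event dominated by channel 2

events = group_events(detect_events(signal))
compositions = event_compositions(signal, events)
```

The rows of `compositions` could be passed to any clustering method that does not require the number of NP families in advance. Classification and segmentation (steps 3 and 4) are not covered by this sketch.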

How to apply: please send a motivation letter and a CV to tharaud[at]ipgp.fr

Further information can be found here.

Investigating Diffusion Models for Astronomical Image Deconvolution - boosting the synergy between Euclid and LSST

Supervised by: Alexandre Boucaud (Université Paris Cité)

Description: The European Space Agency’s Euclid mission, with its space-based observations, provides images with high spatial resolution but limited wavelength coverage. Conversely, the Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST), a ground-based survey, offers extensive wavelength coverage but faces challenges in spatial resolution due to atmospheric distortions. Unifying these two datasets via deconvolution methods can lead to unparalleled high-resolution, multi-wavelength observations of the universe, addressing a variety of astrophysical and cosmological questions. The proposed project aims to develop diffusion models for astronomical image deconvolution and to investigate their efficacy, thereby maximizing the synergy between the Euclid and LSST datasets. Image deconvolution is the process of reversing the optical distortion that takes place during image capture. In astronomical imaging, this is especially crucial given the presence of atmospheric turbulence, instrumental noise, and other sources of degradation that can compromise the clarity and quality of the observations. Traditional algorithms have long been used for deconvolution, but with the rise of AI, particularly deep learning, there has been a paradigm shift in the way these problems are approached. However, although some proof-of-concept works exist with very promising results, using Generative Adversarial Networks for example, these methods have not yet been deployed for scientific analysis, partly because of stability problems (hallucination) and a lack of flexibility. Diffusion models have recently gained traction in the machine learning community as a powerful class of generative models. They model the data generation process as a diffusion process, essentially a Markov chain transitioning from a noisy version of the data to the clean data over several timesteps.
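
As an illustration of the Markov chain mentioned above, the sketch below shows the forward (noising) half of a diffusion model, which the learned generative chain is trained to invert. It assumes the common DDPM-style formulation with a linear variance schedule. The schedule values and the function names `make_schedule` and `q_sample` are illustrative assumptions, not details taken from the project.

```python
import numpy as np


def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (assumed) and cumulative signal fractions."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)   # fraction of signal variance kept at step t
    return betas, alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) = N(sqrt(a_t) * x0, (1 - a_t) * I) in one shot."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
    return xt, noise


rng = np.random.default_rng(0)
betas, alpha_bars = make_schedule()
x0 = rng.random((32, 32))                  # stand-in for a clean image
x_early, _ = q_sample(x0, 10, alpha_bars, rng)    # still close to x0
x_late, _ = q_sample(x0, 999, alpha_bars, rng)    # almost pure Gaussian noise
```

A generative model of this kind is typically trained to predict `noise` from `xt` and `t`, so that sampling can run the chain backwards from noise to a clean image.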
Compared to other generative models such as GANs (Generative Adversarial Networks), VAEs (Variational Autoencoders), and normalizing flows, diffusion models have several key advantages that are fundamental for our purpose: sample quality, stability of training, diversity of generated samples, flexibility, robustness to noise, and easy conditional generation, among others. The goal of this project is therefore to develop and test diffusion models for astronomical image deconvolution, with the idea of maximizing the synergies between Euclid’s high spatial resolution and LSST’s extensive wavelength coverage. Given that actual Euclid and LSST data are both unavailable, the project will use high-resolution, denoised images from the IllustrisTNG hydrodynamical simulations as prior information, and mock observations from the Hyper Suprime-Cam (HSC) survey as the input images.
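
To make the deconvolution problem concrete, here is a minimal sketch of the degradation that deconvolution tries to reverse: a sharp image is convolved with a point spread function (PSF) and corrupted by noise. The circular Gaussian PSF, the additive Gaussian noise, and the function names are simplifying assumptions for illustration. They do not describe the project's actual mock observations.

```python
import numpy as np


def gaussian_psf(size, fwhm):
    """Normalized circular Gaussian PSF on a size x size grid (assumed shape)."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))   # FWHM -> standard deviation
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return psf / psf.sum()                              # unit total flux


def degrade(image, psf, noise_sigma, rng):
    """Blur `image` with `psf` (same shape, circular boundaries) and add noise."""
    kernel = np.fft.ifftshift(psf)                      # move the PSF centre to pixel (0, 0)
    blurred = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(kernel)))
    return blurred + noise_sigma * rng.standard_normal(image.shape)


rng = np.random.default_rng(0)
sharp = np.zeros((64, 64))
sharp[20, 30] = 1.0                                     # a single point source
psf = gaussian_psf(64, fwhm=4.0)
observed = degrade(sharp, psf, noise_sigma=0.001, rng=rng)
```

Deconvolution seeks an estimate of `sharp` given `observed` and `psf`. Because noise makes this inversion ill-posed, prior information about what sharp images look like is needed, and a generative model is one way to supply it.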

How to apply: please send a motivation letter and a CV to aboucaud[at]apc.in2p3.fr

Further information can be found here.
