Below are the internship opportunities currently offered by diiP. These offers are open to second-year Master’s students.

Biomarker prediction from images of histological slides of cancerous tissues using modern AI techniques: comparisons of different neural architectures

Supervised by: Nicolas Loménie (Université Paris Cité)

Description: Digital pathology is on the rise. Most exams are still performed visually under an analog microscope, but new slide scanners have emerged (Philips, Hamamatsu, etc.), driving revolutionary changes at the clinical level (Fig. 1). Given the rich information present in these images, a vast field of opportunity opens up for automated image analysis. This is especially pertinent in oncology, where early and accurate diagnosis is paramount. With the advent of transformer architectures, traditionally used in natural language processing tasks, there is emerging interest in assessing their applicability to whole-slide image (WSI) classification.

How to apply: please send a motivation letter and a CV to nicolas.lomenie[at]

Further information can be found here.

Development of automated segmentation and clustering methods for spICP-ToF-MS time-series in Nanogeochemistry

Supervised by: Mickaël Tharaud (IPGP)

Description: Nanoparticles (NPs) are pervasive in natural systems and play a crucial role in nanogeochemistry. The emergence of single-particle time-of-flight inductively coupled plasma mass spectrometry (spICP-ToF-MS) has revolutionized NP characterization while presenting new challenges in data analysis. This research project seeks to bridge advanced nano-instrumentation with data-driven insights, focusing on the development of standardized methodologies for integrating spICP-ToF-MS with state-of-the-art machine learning algorithms. The IPGP hosts a world-leading geochemistry platform (PARI) equipped with an operational spICP-ToF-MS instrument and possesses an extensive dataset. The project will build on these resources to (1) develop a novel methodology for the automated segmentation and clustering of NP time series generated by spICP-ToF-MS, (2) address challenges including instrumental noise, unknown NP compositions, and large data volumes requiring sophisticated statistical methods, and (3) explore interdisciplinary collaboration between geochemists, data scientists, and analytical chemists.

Proposed methods: Preliminary tests have shown encouraging results using a 4-step methodology described and illustrated below:
1. Detection: Establish a conservative threshold for detecting significant NP signals within time series data using intensity distribution across channels (b).
2. Clustering: Identify families of NP signals through unsupervised clustering algorithms, considering the unknown number of NP families in natural environments.
3. Classification: Train a classifier to differentiate various NPs within continuous time series, including an additional noise class, using realistic data.
4. Segmentation: Divide time series into segments based on the classifier, addressing the challenge of determining optimal segmentation window size (c).
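
As a rough illustration of steps 1 and 2 only, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration and not the project's actual method: the robust per-channel threshold (median plus `k` times a MAD-based sigma), the use of normalized channel fractions as features, the greedy "leader" clustering (chosen here because it does not need the number of families in advance), and all function names and parameter values.

```python
import numpy as np

def detect_events(signal, k=5.0):
    """Step 1 (sketch): return (start, stop) pairs, stop exclusive, where any
    channel exceeds its own threshold.

    signal: array of shape (n_timesteps, n_channels).
    Assumed threshold per channel: median + k * 1.4826 * MAD.
    """
    median = np.median(signal, axis=0)
    mad = np.median(np.abs(signal - median), axis=0)
    threshold = median + k * 1.4826 * mad
    above = (signal > threshold).any(axis=1)
    # Pad with False so every run has both a rising and a falling edge.
    padded = np.concatenate(([False], above, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(a), int(b)) for a, b in zip(edges[::2], edges[1::2])]

def event_compositions(signal, events):
    """Background-subtracted counts per channel, normalized to fractions."""
    median = np.median(signal, axis=0)
    feats = []
    for start, stop in events:
        net = np.clip(signal[start:stop] - median, 0, None).sum(axis=0)
        feats.append(net / net.sum())
    return np.array(feats)

def leader_clustering(features, radius=0.2):
    """Step 2 (sketch): greedy clustering without a preset number of clusters.

    An event joins the nearest existing leader if it lies within `radius`,
    otherwise it becomes the leader of a new cluster.
    """
    leaders, labels = [], []
    for f in features:
        dists = [np.linalg.norm(f - leader) for leader in leaders]
        if dists and min(dists) <= radius:
            labels.append(int(np.argmin(dists)))
        else:
            leaders.append(f)
            labels.append(len(leaders) - 1)
    return labels

def make_demo_signal(n=300):
    """Synthetic 3-channel series: small deterministic background + 3 events."""
    background = 5.0 + (np.arange(n) % 3 - 1)
    signal = np.tile(background[:, None], (1, 3))
    signal[10:13] += [90.0, 30.0, 0.0]    # hypothetical family A
    signal[50:54] += [0.0, 0.0, 80.0]     # hypothetical family B
    signal[120:123] += [60.0, 20.0, 0.0]  # family A again, lower intensity
    return signal

signal = make_demo_signal()
events = detect_events(signal)
labels = leader_clustering(event_compositions(signal, events))
print(events, labels)
```

Steps 3 and 4 (classification and segmentation) are not covered by this sketch. On real data, the threshold rule and the clustering radius would have to be chosen with the instrument's noise characteristics in mind.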

How to apply: please send a motivation letter and a CV to tharaud[at]

Further information can be found here.

Investigating Diffusion Models for Astronomical Image Deconvolution - boosting the synergy between Euclid and LSST

Supervised by: Alexandre Boucaud (Université Paris Cité)

Description: The European Space Agency’s Euclid mission, with its space-based observations, provides high spatial resolution images but has limited wavelength coverage. Conversely, the Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST), a ground-based survey, offers extensive wavelength coverage but faces challenges in spatial resolution due to atmospheric distortions. Unifying these two datasets via deconvolution methods can lead to unparalleled high-resolution, multi-wavelength observations of the universe, addressing a variety of astrophysical and cosmological questions. The proposed project aims to develop diffusion models for astronomical image deconvolution and to investigate their efficacy, thereby maximizing the synergy between the Euclid and LSST datasets.
Image deconvolution is the process of reversing the optical distortion that takes place during image capture. In astronomical imaging, this is especially crucial given the presence of atmospheric turbulence, instrumental noise, and other sources of degradation that can compromise the clarity and quality of the observations. Traditional algorithms have long been used for deconvolution, but with the rise of AI, particularly deep learning, there has been a paradigm shift in the way these problems are approached. However, although some proof-of-concept works show very promising results, for example with Generative Adversarial Networks, no such method has yet been deployed for scientific analysis, partly because of stability problems (hallucination) and a lack of flexibility.
Diffusion models have recently gained traction in the machine learning community as a powerful class of generative models. They model data generation as a diffusion process, essentially a Markov chain that transitions from a noisy version of the data to the clean data over several timesteps.
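
To make the noising side of such a Markov chain concrete, here is a minimal sketch of a forward diffusion step. It assumes a DDPM-style formulation with a linear variance schedule, which the description above does not specify; the function names and schedule values are illustrative only. A trained model would learn the reverse direction, from noisy to clean data.

```python
import numpy as np

def linear_beta_schedule(n_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances (assumed linear schedule)."""
    return np.linspace(beta_start, beta_end, n_steps)

def forward_diffuse(x0, t, betas, rng):
    """Draw x_t given x_0 in closed form:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise,
    where alpha_bar_t is the cumulative product of (1 - beta).
    Returns the noisy sample and the noise that was added.
    """
    alpha_bar = np.cumprod(1.0 - betas)
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
clean = np.ones((8, 8))  # stand-in for a clean image
noisy, added_noise = forward_diffuse(clean, 500, betas, rng)
print(noisy.shape)
```

Under this assumed schedule, the last timestep is almost pure Gaussian noise, which is what lets generation start from a random draw.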
Compared to other generative models such as GANs (Generative Adversarial Networks), VAEs (Variational Autoencoders), and normalizing flows, diffusion models have several key advantages that are fundamental for our purpose, among them sample quality, training stability, diversity of generated samples, flexibility, robustness to noise, and easy conditional generation. The goal of this project is therefore to develop and test diffusion models for astronomical image deconvolution, with the aim of maximizing the synergy between Euclid’s high spatial resolution and LSST’s extensive wavelength coverage. Given the unavailability of both actual Euclid and LSST data, the project will use high-resolution, denoised images from the IllustrisTNG hydrodynamical simulations as prior information, and mock observations from the Hyper Suprime-Cam (HSC) survey as the input images.
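
The degradation that deconvolution tries to undo can be imitated by blurring a high-resolution image with a point spread function (PSF) and adding noise. The sketch below is only a toy version of that forward model: the Gaussian PSF, the circular FFT convolution, the white Gaussian noise, and all names and values are assumptions, and it does not reproduce how the HSC mock observations are actually built.

```python
import numpy as np

def gaussian_psf(size, sigma):
    """Normalized Gaussian PSF on a size x size grid, centred at size // 2."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return psf / psf.sum()

def degrade(image, psf, noise_sigma, rng):
    """Blur `image` with `psf` (same shape, circular boundary) and add noise."""
    # ifftshift moves the PSF centre to index (0, 0) so the blur adds no shift.
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    blurred = np.real(np.fft.ifft2(np.fft.fft2(image) * otf))
    return blurred + noise_sigma * rng.standard_normal(image.shape)

rng = np.random.default_rng(0)
sharp = np.zeros((64, 64))
sharp[10, 20] = 1.0  # a single point source
observed = degrade(sharp, gaussian_psf(64, 2.0), noise_sigma=0.01, rng=rng)
print(observed.shape)
```

Deconvolution is the inverse problem: recovering something close to `sharp` from `observed`. It is ill-posed because the blur suppresses fine detail and the noise hides what remains, which is why prior information matters.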

How to apply: please send a motivation letter and a CV to aboucaud[a]

Further information can be found here.

Read also

diiP Summer School: June 10-14, 2024

The diiP is organizing a Summer School on Data Science (with a focus on deep learning data analytics techniques) on June 10-14. Read the details below, and register now! The first diiP Summer School on Data Science (dSDS) will be held...

diiP Projects Day: December 6th, 2023

Join us for the diiP Projects Day, an in-person event that will highlight past and upcoming projects, offer opportunities for discussions and networking, and host Prof. Joseph Sifakis (Turing Award winner, 2007) for the last Distinguished Lecture of 2023....

diiP call for proposals 2024

The Data Intelligence Institute of Paris funds two types of projects: strategic projects and Master's internships. These projects start in January. The deadline to apply for the 2024 call for projects is October 22nd, 2023, at 11:59 PM (CET). ...