Foula Vagena
April 13, 4 PM
Online (Zoom)

 

Abstract

Generative modeling is a field of machine learning that involves automatically discovering and learning regularities or patterns in data, so that the model can generate new examples that could plausibly have been drawn from the original dataset. In statistical machine learning, a generative model explicitly describes the joint probability distribution of the input (X) and output (Y) variables, i.e. P(Y, X). Several such models are common in practice, including Naive Bayes, hidden Markov models (HMMs) and Markov random fields (MRFs). Deep generative models (DGMs) are neural networks with many hidden layers, trained on a large number of samples to approximate complicated, high-dimensional and at times unknown probability distributions. The literature on DGMs is growing rapidly, and some advances have reached the public sphere, for example the recent successes in generating realistic-looking images, voices or movies (so-called deep fakes), which usually employ variational autoencoders (VAEs) or generative adversarial networks (GANs). In this tutorial we give an overview of generative modeling with two aims: (1) to provide a broad overview of the field and (2) to identify, to the extent possible, the common ground and the main differences between the statistical and the deep learning approaches. The tutorial will conclude with code examples of (1) a statistical generative model (Naive Bayes) and (2) a simple deep learning generative model (GAN).
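To make the idea of modeling the joint distribution P(Y, X) concrete, here is a minimal sketch of a Gaussian Naive Bayes model used generatively. It is not the code presented in the tutorial: the function names `fit_gaussian_nb` and `sample_joint`, the toy two-class dataset and the choice of Gaussian features are all assumptions made for illustration. The model estimates P(Y) and one Gaussian per class and feature for P(X | Y), then draws new (x, y) pairs from the product P(Y) P(X | Y).

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate class priors P(Y) and per-class, per-feature Gaussians for P(X | Y)."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Small constant avoids a zero standard deviation (illustrative choice).
    stds = np.array([X[y == c].std(axis=0) + 1e-6 for c in classes])
    return classes, priors, means, stds

def sample_joint(model, n, rng):
    """Draw n pairs (x, y) from the fitted joint P(Y, X) = P(Y) * prod_j P(X_j | Y)."""
    classes, priors, means, stds = model
    idx = rng.choice(len(classes), size=n, p=priors)   # sample y ~ P(Y)
    X_new = rng.normal(means[idx], stds[idx])          # sample x ~ P(X | Y = y)
    return X_new, classes[idx]

# Hypothetical toy data: two classes with different means and spreads.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-2.0, 1.0, size=(200, 2)),
    rng.normal(3.0, 0.5, size=(100, 2)),
])
y = np.array([0] * 200 + [1] * 100)

model = fit_gaussian_nb(X, y)
X_new, y_new = sample_joint(model, 1000, rng)
```

The same fitted quantities could also be used for classification via Bayes' rule, which is what distinguishes a generative model from one that only models P(Y | X).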

Hands-On Workshop on “Generative Naïve Bayes + GAN examples”.

 

Dr Foula Vagena
(Université Paris Cité, diiP)
Zografoula Vagena is a research associate at the Data Intelligence Institute of Paris (diiP) and is affiliated with the Université Paris Cité. She has been a data science researcher and practitioner for over ten years. She has worked on a range of analytics problems, including forecasting, image processing, graph analytics, multidimensional data analysis, text processing, recommendation systems, sequential data analysis and optimization, in fields such as transportation, healthcare, retail, finance/insurance and accounting. She has also carried out research at the intersection of data management and analytics, and was a primary contributor to the MCDB/SimSQL systems, which blended data management with Bayesian statistics. She holds a PhD in data management from the University of California, Riverside.

