Variational Autoencoders in Computer Vision
In my previous article, I talked about Autoencoders and their applications, especially in the Computer Vision field.
Generally speaking, an autoencoder (AE) learns to represent some input information (in our case, input images) by compressing it into a latent space and then reconstructing the input from that compressed form into a new, auto-generated output image, back in the original domain space.
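To make this concrete, here is a minimal sketch of an autoencoder in PyTorch (my choice of framework for illustration; the layer sizes and the 32-dimensional latent space for 28x28 grayscale images are assumptions I made for the example, not values from any specific model):

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Minimal autoencoder: compresses an image to a latent vector, then reconstructs it."""
    def __init__(self, latent_dim=32):
        super().__init__()
        # Encoder: 1x28x28 image -> latent_dim vector (the compressed representation)
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Decoder: latent_dim vector -> reconstructed 1x28x28 image
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 28 * 28), nn.Sigmoid(),
            nn.Unflatten(1, (1, 28, 28)),
        )

    def forward(self, x):
        z = self.encoder(x)      # compressed (latent) representation
        return self.decoder(z)   # reconstruction in the original image space

# Training minimizes a reconstruction loss between input and output, e.g.:
model = Autoencoder()
x = torch.rand(16, 1, 28, 28)    # a dummy batch of grayscale images
loss = nn.functional.mse_loss(model(x), x)
```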
In this article, I’m going to focus on a particular class of autoencoders introduced somewhat later: Variational Autoencoders (VAEs).
VAEs are a variation of AEs in the sense that their main job is to learn a probabilistic model over the latent feature space of the compressed image, from which one can sample and generate new images.
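As a sketch of this idea (again in PyTorch, with layer sizes chosen only for illustration), the encoder now outputs the parameters of a Gaussian distribution over the latent space; new images are generated by sampling a latent vector and decoding it:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    """Minimal VAE: the encoder predicts a Gaussian (mean, log-variance) over the latent space."""
    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU())
        self.fc_mu = nn.Linear(256, latent_dim)       # mean of q(z|x)
        self.fc_logvar = nn.Linear(256, latent_dim)   # log-variance of q(z|x)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 28 * 28), nn.Sigmoid(),
            nn.Unflatten(1, (1, 28, 28)),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterization trick: sample z = mu + sigma * eps, with eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.decoder(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Reconstruction term plus KL divergence between q(z|x) and the N(0, I) prior
    recon_loss = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl

# Generating new images: sample latent vectors from the prior and decode them
model = VAE()
z = torch.randn(16, 32)
new_images = model.decoder(z)
```

The key difference from the plain autoencoder above is that the latent code is a distribution rather than a single point, which is what makes sampling new, unseen images possible.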
Differences between AEs and VAEs
Let’s have a look at the main differences between the two methods:
- AEs learn a compressed representation of images (the same holds for other domains, like NLP, where the input data would be text). They do so by first compressing the image via an encoder network and then decompressing it with a decoder network back to the original domain space. The goal is a decompressed image that is as close as possible to the original input.