Empirical evaluation of the effectiveness of variational autoencoders on data augmentation for the image classification problem
Abstract
In the last decade, deep learning methods have become a key solution for various machine learning problems. One major drawback of deep learning methods is that they require large datasets to generalize well. Researchers have proposed data augmentation techniques that generate synthetic data to overcome this problem. Traditional methods such as flipping and rotation, which are referred to as transformation-based methods in this study, are commonly used in the literature to obtain synthetic data. These methods take an image as input and process it to obtain a new one. In contrast, generative models such as generative adversarial networks and autoencoders learn to generate synthetic data once they have been trained on a set of images. Recently, generative models have been widely used for data augmentation in various domains. In this study, we evaluate the effectiveness of one generative model, the variational autoencoder (VAE), on the image classification problem. For this purpose, we train a VAE on the CIFAR-10 dataset and generate synthetic samples with this model. We evaluate classification performance using datasets of various sizes and compare the results on four datasets: a dataset without augmentation, a dataset augmented with the VAE, and two datasets augmented with transformation-based methods. We observe that the contribution of data augmentation is sensitive to the size of the dataset and that VAE augmentation is as effective as the transformation-based augmentation methods.
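The two augmentation families contrasted above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function names, the restriction to 90-degree rotations, the latent dimension, and `toy_decoder` (an untrained stand-in for a trained VAE decoder) are all assumptions made for illustration. Only the image shape, 32 x 32 x 3, follows from the use of CIFAR-10.

```python
import numpy as np


def transform_augment(image, rng):
    """Transformation-based augmentation: derive a new image from an existing one.

    Assumes an H x W x C array with H == W, so rotation keeps the shape.
    """
    out = image
    if rng.random() < 0.5:
        out = np.fliplr(out)  # horizontal flip
    k = int(rng.integers(0, 4))  # number of 90-degree rotations (assumption)
    out = np.rot90(out, k=k, axes=(0, 1))
    return out.copy()


def sample_from_decoder(decoder, n_samples, latent_dim, rng):
    """Generative augmentation: decode latent vectors drawn from the N(0, I) prior."""
    z = rng.standard_normal((n_samples, latent_dim))
    return decoder(z)


def toy_decoder(z):
    """Hypothetical stand-in for a trained VAE decoder (fixed random linear map)."""
    weights = np.random.default_rng(0).standard_normal((z.shape[1], 32 * 32 * 3))
    pixels = 1.0 / (1.0 + np.exp(-(z @ weights)))  # squash into [0, 1]
    return pixels.reshape(-1, 32, 32, 3)


rng = np.random.default_rng(42)
image = rng.random((32, 32, 3))  # placeholder for one CIFAR-10 image
augmented = transform_augment(image, rng)
synthetic = sample_from_decoder(toy_decoder, n_samples=5, latent_dim=16, rng=rng)
```

The transformation-based function needs a source image for every new sample, whereas the generative function needs only the trained decoder and can produce any number of samples from the prior.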