Generative III - GAN training
Presenter | Papers | Paper URL | Our Slides |
---|---|---|---|
Arshdeep | Generalization and Equilibrium in Generative Adversarial Nets (ICML17) 1 | PDF + video | |
Arshdeep | Mode Regularized Generative Adversarial Networks (ICLR17) 2 | ||
Bargav | Improving Generative Adversarial Networks with Denoising Feature Matching, ICLR17 3 | ||
Anant | Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy, ICLR17 4 | PDF + code |
-
Generalization and Equilibrium in Generative Adversarial Nets (ICML17)/ We show that training of generative adversarial network (GAN) may not have good generalization properties; e.g., training may appear successful but the trained distribution may be far from target distribution in standard metrics. However, generalization does occur for a weaker metric called neural net distance. It is also shown that an approximate pure equilibrium exists in the discriminator/generator game for a special class of generators with natural training objectives when generator capacity and training set sizes are moderate. This existence of equilibrium inspires MIX+GAN protocol, which can be combined with any existing GAN training, and empirically shown to improve some of them. ↩
-
Mode Regularized Generative Adversarial Networks (ICLR17)/ Although Generative Adversarial Networks achieve state-of-the-art results on a variety of generative tasks, they are regarded as highly unstable and prone to miss modes. We argue that these bad behaviors of GANs are due to the very particular functional shape of the trained discriminators in high dimensional spaces, which can easily make training stuck or push probability mass in the wrong direction, towards that of higher concentration than that of the data generating distribution. We introduce several ways of regularizing the objective, which can dramatically stabilize the training of GAN models. We also show that our regularizers can help the fair distribution of probability mass across the modes of the data generating distribution, during the early phases of training and thus providing a unified solution to the missing modes problem. ↩
-
Improving Generative Adversarial Networks with Denoising Feature Matching, ICLR17 / David Warde-Farley, Yoshua Bengio/ Abstract: We propose an augmented training procedure for generative adversarial networks designed to address shortcomings of the original by directing the generator towards probable configurations of abstract discriminator features. We estimate and track the distribution of these features, as computed from data, with a denoising auto-encoder, and use it to propose high-level targets for the generator. We combine this new loss with the original and evaluate the hybrid criterion on the task of unsupervised image synthesis from datasets comprising a diverse set of visual categories, noting a qualitative and quantitative improvement in the objectness’’ of the resulting samples. ↩
-
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy, ICLR17 / Dougal J. Sutherland, Hsiao-Yu Tung, Heiko Strathmann, Soumyajit De, Aaditya Ramdas, Alex Smola, Arthur Gretton / We propose a method to optimize the representation and distinguishability of samples from two probability distributions, by maximizing the estimated power of a statistical test based on the maximum mean discrepancy (MMD). This optimized MMD is applied to the setting of unsupervised learning by generative adversarial networks (GAN), in which a model attempts to generate realistic samples, and a discriminator attempts to tell these apart from data samples. In this context, the MMD may be used in two roles: first, as a discriminator, either directly on the samples, or on features of the samples. Second, the MMD can be used to evaluate the performance of a generative model, by testing the model’s samples against a reference data set. In the latter role, the optimized MMD is particularly helpful, as it gives an interpretable indication of how the model and data distributions differ, even in cases where individual model samples are not easily distinguished either by eye or by classifier. ↩