Potential Deep Learning Readings We Plan to Finish in Fall 2017


  1. DeepLearningSummerSchool17 + videolectures
  2. Andrew Ng - Nuts and Bolts of Applying Deep Learning: https://www.youtube.com/watch?v=F1ka6a13S9I
  3. Ganguli - Theoretical Neuroscience and Deep Learning DLSS16 http://videolectures.net/deeplearning2016_ganguli_theoretical_neuroscience/
  4. Ganguli - Theoretical Neuroscience and Deep Learning.pdf DLSS17 https://drive.google.com/file/d/0B6NHiPcsmak1dkZMbzc2YWRuaGM/view
  5. Sharp Minima Can Generalize For Deep Nets, Laurent Dinh (Univ. Montreal), Razvan Pascanu, Samy Bengio (Google Brain), Yoshua Bengio (Univ. Montreal)
  6. Automated Curriculum Learning for Neural Networks, Alex Graves, Marc G. Bellemare, Jacob Menick, Koray Kavukcuoglu, Remi Munos
  7. Learning to learn without gradient descent by gradient descent, Yutian Chen, Matthew Hoffman, Sergio Gomez, Misha Denil, Timothy Lillicrap, Matthew Botvinick, Nando de Freitas
  8. Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study, Samuel Ritter, David Barrett, Adam Santoro, Matt Botvinick
  9. Geometry of Neural Network Loss Surfaces via Random Matrix Theory, Jeffrey Pennington, Yasaman Bahri
  10. On the Expressive Power of Deep Neural Networks, Maithra Raghu, Ben Poole, Surya Ganguli, Jon Kleinberg, Jascha Sohl-Dickstein
  11. Neuroscience-Inspired Artificial Intelligence, http://www.cell.com/neuron/fulltext/S0896-6273(17)30509-3
  12. Understanding deep learning requires rethinking generalization, ICLR17
  13. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, ICLR17
  14. Normalizing the Normalizers: Comparing and Extending Network Normalization Schemes, ICLR17
  15. Capacity and Trainability in Recurrent Neural Networks, ICLR17
  16. Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations, ICLR17
  17. Frustratingly Short Attention Spans in Neural Language Modeling, ICLR17
  18. Topology and Geometry of Half-Rectified Network Optimization, ICLR17
  19. Central Moment Discrepancy (CMD) for Domain-Invariant Representation Learning, ICLR17
  20. Adversarial Feature Learning, ICLR17
  21. Do Deep Convolutional Nets Really Need to be Deep and Convolutional?, ICLR17
  22. Why Deep Neural Networks for Function Approximation?, ICLR17
  23. Bengio - Recurrent Neural Networks - DLSS 2017.pdf: https://drive.google.com/file/d/0ByUKRdiCDK7-LXZkM3hVSzFGTkE/view
  24. On the Expressive Power of Deep Neural Networks, Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein ; PMLR 70:2847-2854
  25. Equivariance Through Parameter-Sharing, Siamak Ravanbakhsh, Jeff Schneider, Barnabás Póczos ; PMLR 70:2892-2901
  26. Large-Scale Evolution of Image Classifiers, Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Jie Tan, Quoc V. Le, Alexey Kurakin ; PMLR 70:2902-2911
  27. Depth-Width Tradeoffs in Approximating Natural Functions with Neural Networks, Itay Safran, Ohad Shamir ; PMLR 70:2979-2987
  28. A Closer Look at Memorization in Deep Networks, ICML17
  29. Dynamic Word Embeddings, ICML17
  30. Combining Low-Density Separators with CNNs, Yu-Xiong Wang*, Carnegie Mellon University; Martial Hebert, Carnegie Mellon University, NIPS16
  31. CNNpack: Packing Convolutional Neural Networks in the Frequency Domain, NIPS16
  32. Residual Networks are Exponential Ensembles of Relatively Shallow Networks, NIPS16
  33. Dense Associative Memory for Pattern Recognition, NIPS16
  34. Learning Kernels with Random Features, Aman Sinha*, Stanford University; John Duchi, NIPS16
  35. Simple and Efficient Weighted Minwise Hashing, NIPS16
  36. Reward Augmented Maximum Likelihood for Neural Structured Prediction
  37. Unimodal Probability Distributions for Deep Ordinal Classification, ICML17
  38. End-to-End Learning for Structured Prediction Energy Networks, ICML17
  39. Orthogonal Random Features, NIPS16
  40. Learning Structured Sparsity in Deep Neural Networks, NIPS16
  41. Learning the Number of Neurons in Deep Networks, NIPS16
  42. Quantized Random Projections and Non-Linear Estimation of Cosine Similarity, NIPS16
  43. An equivalence between high dimensional Bayes optimal inference and M-estimation, NIPS16
  44. High Dimensional Structured Superposition Models, NIPS16
  45. Learning Deep Embeddings with Histogram Loss, NIPS16
  46. Learning values across many orders of magnitude, NIPS16
  47. Learning Deep Parsimonious Representations, NIPS16
  48. Efficient High-Order Interaction-Aware Feature Selection Based on Conditional Mutual Information, NIPS16
  49. A Bayesian method for reducing bias in neural representational similarity analysis, NIPS16
  50. Richards - Deep_Learning_in_the_Brain.pd https://drive.google.com/file/d/0B2A1tnmq5zQdcFNkWU1vdDJiT00/view and https://drive.google.com/file/d/0B2A1tnmq5zQdQWU0Skd6TVVQYUE/view?usp=drive_web

DNN with Varying Structures

  1. SCAN: Learning Abstract Hierarchical Compositional Visual Concepts, https://arxiv.org/pdf/1707.03389.pdf
  2. Krueger - Bayesian Hypernetworks.pdf https://drive.google.com/file/d/0B6NHiPcsmak1RUlucW1RN29oS3M/view?usp=drive_web
  3. Leblond and Alayrac - SeaRNN.pdf https://drive.google.com/file/d/0B6NHiPcsmak1SDVEaWc0OWtaV0k/view?usp=drive_web
  4. Sharir - Overlapping Architectures.pdf https://drive.google.com/file/d/0B6NHiPcsmak1ZzVkci1EdVN2YkU/view?usp=drive_web
  5. Ullrich - Bayesian Compression.pd https://drive.google.com/file/d/0B6NHiPcsmak1WlRUeHFpSW5OZGc/view?usp=drive_web
  6. Understanding Synthetic Gradients and Decoupled Neural Interfaces, Wojtek Czarnecki, Grzegorz Świrszcz, Max Jaderberg, Simon Osindero, Oriol Vinyals, Koray Kavukcuoglu, ICML17
  7. Video Pixel Networks, Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu
  8. AdaNet: Adaptive Structural Learning of Artificial Neural Networks, Corinna Cortes, Xavi Gonzalvo, Vitaly Kuznetsov, Mehryar Mohri, Scott Yang
  9. Learning to Generate Long-term Future via Hierarchical Prediction, Ruben Villegas, Jimei Yang, Yuliang Zou, Sungryull Sohn, Xunyu Lin, Honglak Lee
  10. Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning, Junhyuk Oh, Satinder Singh, Honglak Lee, Pushmeet Kohli
  11. Latent LSTM Allocation: Joint Clustering and Non-Linear Dynamic Modeling of Sequence Data, Manzil Zaheer, Amr Ahmed, Alex Smola
  12. Large-Scale Evolution of Image Classifiers, Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Jie Tan, Quoc Le, Alexey Kurakin
  13. Sequence Modeling via Segmentations, Chong Wang (Microsoft Research) · Yining Wang (CMU) · Po-Sen Huang (Microsoft Research) · Abdelrahman Mohamed (Microsoft) · Dengyong Zhou (Microsoft Research) · Li Deng (Citadel)
  14. ProtoNN: Compressed and Accurate kNN for Resource-scarce Devices
  15. Adaptive Neural Networks for Fast Test-Time Prediction
  16. Making Neural Programming Architectures Generalize via Recursion, ICLR17
  17. Optimization as a Model for Few-Shot Learning, ICLR17
  18. Learning End-to-End Goal-Oriented Dialog, ICLR17
  19. Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, ICLR17
  20. Nonparametric Neural Networks, ICLR17
  21. An Information-Theoretic Framework for Fast and Robust Unsupervised Learning via Neural Population Infomax, ICLR17
  22. Improving Neural Language Models with a Continuous Cache, ICLR17
  23. Variational Recurrent Adversarial Deep Domain Adaptation, ICLR17
  24. Soft Weight-Sharing for Neural Network Compression, ICLR17
  25. Tracking the World State with Recurrent Entity Networks, (LeCun), ICLR17
  26. Deep Biaffine Attention for Neural Dependency Parsing, ICLR17
  27. Learning to Remember Rare Events, ICLR17
  28. Transfer Learning for Sequence Tagging with Hierarchical Recurrent Networks, ICLR17
  29. Deep Learning with Dynamic Computation Graphs, ICLR17
  30. Query-Reduction Networks for Question Answering, ICLR17
  31. Bidirectional Attention Flow for Machine Comprehension, ICLR17
  32. Dynamic Coattention Networks For Question Answering, ICLR17
  33. Structured Attention Networks, ICLR17
  34. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer, (Dean), ICLR17
  35. Attend, Adapt and Transfer: Attentive Deep Architecture for Adaptive Transfer from multiple sources in the same domain, ICLR17
  36. Mollifying Networks, Bengio, ICLR17
  37. Automatic Rule Extraction from Long Short Term Memory Networks, ICLR17
  38. Loss-aware Binarization of Deep Networks, ICLR17
  39. Deep Multi-task Representation Learning: A Tensor Factorisation Approach, ICLR17
  40. Towards Deep Interpretability (MUS-ROVER II): Learning Hierarchical Representations of Tonal Music, ICLR17
  41. Reasoning with Memory Augmented Neural Networks for Language Comprehension, ICLR17
  42. Semi-Supervised Classification with Graph Convolutional Networks, ICLR17
  43. Hierarchical Multiscale Recurrent Neural Networks, ICLR17
  44. AdaNet: Adaptive Structural Learning of Artificial Neural Networks, ICML17
  45. Language Modeling with Gated Convolutional Networks, ICML17
  46. Image-to-Markup Generation with Coarse-to-Fine Attention, ICML17
  47. Input Switched Affine Networks: An RNN Architecture Designed for Interpretability, ICML17
  48. Differentiable Programs with Neural Libraries, ICML17
  49. Convolutional Sequence to Sequence Learning, ICML17
  50. State-Frequency Memory Recurrent Neural Networks, ICML17
  51. SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization, Juyong Kim, Yookoon Park, Gunhee Kim, Sung Ju Hwang ; PMLR 70:1866-1874
  52. Deriving Neural Architectures from Sequence and Graph Kernels, Tao Lei, Wengong Jin, Regina Barzilay, Tommi Jaakkola ; PMLR 70:2024-2033
  53. Delta Networks for Optimized Recurrent Network Computation, Daniel Neil, Jun Haeng Lee, Tobi Delbruck, Shih-Chii Liu ; PMLR 70:2584-2593
  54. Recurrent Highway Networks, Julian Georg Zilly, Rupesh Kumar Srivastava, Jan Koutník, Jürgen Schmidhuber ; PMLR 70:4189-4198
  55. Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, ICML16
  56. OptNet: Differentiable Optimization as a Layer in Neural Networks, ICML17
  57. Swapout: Learning an ensemble of deep architectures, Saurabh Singh*, UIUC; Derek Hoiem, UIUC; David Forsyth, UIUC, NIPS16
  58. Natural-Parameter Networks: A Class of Probabilistic Neural Networks, Hao Wang*, HKUST; Xingjian Shi; Dit-Yan Yeung, NIPS16
  59. Learning What and Where to Draw, NIPS16
  60. Hierarchical Question-Image Co-Attention for Visual Question Answering, NIPS16
  61. Proximal Deep Structured Models, NIPS16
  62. Direct Feedback Alignment Provides Learning In Deep Neural Networks, NIPS16
  63. Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes, NIPS16
  64. Matching Networks for One Shot Learning, NIPS16
  65. Can Active Memory Replace Attention?, Łukasz Kaiser*; Samy Bengio, NIPS16
  66. Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences, NIPS16
  67. Binarized Neural Networks, NIPS16
  68. Interaction Networks for Learning about Objects, Relations and Physics, NIPS16
  69. Optimal Architectures in a Solvable Model of Deep Networks, NIPS16

Reliability, Benchmarking, and Applications

  1. Conditional Image Generation with Pixel CNN Decoders, NIPS16
  2. Dhruv - Visual Dialog - RLSS 2017 https://drive.google.com/file/d/0BzUSSMdMszk6RndSbkEzcnRFMGs/view and https://drive.google.com/file/d/0BzUSSMdMszk6cDVBMlRqLUs3TFk/view
  3. Input Switched Affine Networks: An RNN Architecture Designed for Interpretability, Jakob Foerster, Justin Gilmer, Jan Chorowski, Jascha Sohl-Dickstein, David Sussillo
  4. Axiomatic Attribution for Deep Networks, Ankur Taly, Qiqi Yan, Mukund Sundararajan
  5. Differentiable Programs with Neural Libraries, Alex L Gaunt, Marc Brockschmidt, Nate Kushman, Daniel Tarlow
  6. Neural Optimizer Search with Reinforcement Learning, Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc Le
  7. Measuring Sample Quality with Kernels, Jackson Gorham (Stanford) · Lester Mackey (Microsoft Research)
  8. Learning Continuous Semantic Representations of Symbolic Expressions, ICML17
  9. Recovery Guarantees for One-hidden-layer Neural Networks, ICML17
  10. On the State of the Art of Evaluation in Neural Language Models, https://arxiv.org/abs/1707.05589
  11. End-to-end Optimized Image Compression, ICLR17
  12. Multi-Agent Cooperation and the Emergence of (Natural) Language, ICLR17
  13. Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, ICLR17
  14. Deep Learning with Differential Privacy, CCS16
  15. Privacy-Preserving Deep Learning, CCS15
  16. Learning to Query, Reason, and Answer Questions On Ambiguous Texts, ICLR17
  17. Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy, ICLR17
  18. Data Noising as Smoothing in Neural Network Language Models (Ng), ICLR17
  19. A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks, ICLR17
  20. Visualizing Deep Neural Network Decisions: Prediction Difference Analysis, ICLR17
  21. On Detecting Adversarial Perturbations, ICLR17
  22. Delving into Transferable Adversarial Examples and Black-box Attacks, ICLR17
  23. Parseval Networks: Improving Robustness to Adversarial Examples, ICML17
  24. iSurvive: An Interpretable, Event-time Prediction Model for mHealth, ICML17
  25. Being Robust (in High Dimensions) Can Be Practical, ICML17
  26. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, ICML17
  27. On Calibration of Modern Neural Networks, ICML17
  28. Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs, ICML17
  29. Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation, ICML17
  30. Analogical Inference for Multi-relational Embeddings, Hanxiao Liu, Yuexin Wu, Yiming Yang ; PMLR 70:2168-2178
  31. Deep Transfer Learning with Joint Adaptation Networks, Mingsheng Long, Han Zhu, Jianmin Wang, Michael I. Jordan ; PMLR 70:2208-2217
  32. Sequence to Better Sequence: Continuous Revision of Combinatorial Structures, Jonas Mueller, David Gifford, Tommi Jaakkola ; PMLR 70:2536-2544
  33. Meta Networks, Tsendsuren Munkhdalai, Hong Yu ; PMLR 70:2554-2563
  34. Geometry of Neural Network Loss Surfaces via Random Matrix Theory, Jeffrey Pennington, Yasaman Bahri ; PMLR 70:2798-2806
  35. Asymmetric Tri-training for Unsupervised Domain Adaptation, Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada ; PMLR 70:2988-2997
  36. Developing Bug-Free Machine Learning Systems With Formal Mathematics, Daniel Selsam, Percy Liang, David L. Dill ; PMLR 70:3047-3056
  37. Learning Important Features Through Propagating Activation Differences, Avanti Shrikumar, Peyton Greenside, Anshul Kundaje ; PMLR 70:3145-3153
  38. High-Dimensional Structured Quantile Regression, ICML17
  39. Know-Evolve: Deep Temporal Reasoning for Dynamic Knowledge Graphs, Rakshit Trivedi, Hanjun Dai, Yichen Wang, Le Song ; PMLR 70:3462-3471
  40. Learning to Generate Long-term Future via Hierarchical Prediction, Ruben Villegas, Jimei Yang, Yuliang Zou, Sungryull Sohn, Xunyu Lin, Honglak Lee ; PMLR 70:3560-3569
  41. Sequence Modeling via Segmentations, Chong Wang, Yining Wang, Po-Sen Huang, Abdelrahman Mohamed, Dengyong Zhou, Li Deng ; PMLR 70:3674-3683
  42. A Unified View of Multi-Label Performance Measures, Xi-Zhu Wu, Zhi-Hua Zhou ; PMLR 70:3780-3788
  43. Convexified Convolutional Neural Networks, Yuchen Zhang, Percy Liang, Martin J. Wainwright ; PMLR 70:4044-4053
  44. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, ICML16
  45. Learning Transferrable Representations for Unsupervised Domain Adaptation, NIPS16
  46. Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity, NIPS16
  47. Unsupervised Domain Adaptation with Residual Transfer Networks, Mingsheng Long*, Tsinghua University; Han Zhu, Tsinghua University; Jianmin Wang, Tsinghua University; Michael Jordan, NIPS16
  48. Interpretable Distribution Features with Maximum Testing Power, Wittawat Jitkrittum*, Gatsby Unit, UCL; Zoltan Szabo; Kacper Chwialkowski, Gatsby Unit, UCL; Arthur Gretton, NIPS16
  49. Domain Separation Networks, NIPS16
  50. Multimodal Residual Learning for Visual QA, NIPS16
  51. Learning feed-forward one-shot learners, NIPS16
  52. Adversarial Multiclass Classification: A Risk Minimization Perspective, NIPS16
  53. Generating Images with Perceptual Similarity Metrics based on Deep Networks, NIPS16
  54. Dialog-based Language Learning, Jason Weston*, NIPS16
  55. The Robustness of Estimator Composition, NIPS16
  56. Large Margin Discriminant Dimensionality Reduction in Prediction Space, NIPS16
  57. Robustness of classifiers: from adversarial to random noise, NIPS16
  58. Examples are not Enough, Learn to Criticize! Model Criticism for Interpretable Machine Learning, NIPS16
  59. Blind Attacks on Machine Learners, Alex Beatson*, Princeton University; Zhaoran Wang, Princeton University; Han Liu, NIPS16
  60. Composing graphical models with neural networks for structured representations and fast inference, NIPS16
  61. Spatiotemporal Residual Networks for Video Action Recognition, NIPS16
  62. Learning Important Features Through Propagating Activation Differences, ICML17

Optimization


  1. Johnson - Automatic Differentiation.p https://drive.google.com/file/d/0B6NHiPcsmak1ckYxR2hmRGdzdFk/view
  2. Osborne - Probabilistic numerics for deep learning - DLSS 2017.pdf https://drive.google.com/file/d/0B2A1tnmq5zQdWHBYOFctNi1KdVU/view
  3. Learned Optimizers that Scale and Generalize, Olga Wichrowska, Niru Maheswaranathan, Matthew Hoffman, Sergio Gomez, Misha Denil, Nando de Freitas, Jascha Sohl-Dickstein
  4. Learning to learn by gradient descent by gradient descent
  5. Asynchronous Stochastic Gradient Descent with Delay Compensation, ICML17
  6. How to Escape Saddle Points Efficiently, Chi Jin (UC Berkeley) · Rong Ge (Duke University) · Praneeth Netrapalli (Microsoft Research) · Sham M. Kakade (University of Washington) · Michael Jordan (UC Berkeley)
  7. Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter
  8. Batched High-dimensional Bayesian Optimization via Structural Kernel Learning
  9. Towards Principled Methods for Training Generative Adversarial Networks, ICLR17
  10. Optimization as a Model for Few-Shot Learning, ICLR17
  11. Amortised MAP Inference for Image Super-resolution, ICLR17
  12. Neural Architecture Search with Reinforcement Learning, ICLR17
  13. Distributed Second-Order Optimization using Kronecker-Factored Approximations, ICLR17
  14. Mode Regularized Generative Adversarial Networks, ICLR17
  15. Highway and Residual Networks learn Unrolled Iterative Estimation, ICLR17
  16. Snapshot Ensembles: Train 1, Get M for Free, ICLR17
  17. Learning to Optimize, ICLR17
  18. Recurrent Batch Normalization, ICLR17
  19. Adversarially Learned Inference, ICLR17
  20. Reasoning with Memory Augmented Neural Networks for Language Comprehension, ICLR17
  21. Deep ADMM-Net for Compressive Sensing MRI, NIPS16
  22. Sharp Minima Can Generalize For Deep Nets, ICML17
  23. Forward and Reverse Gradient-Based Hyperparameter Optimization, ICML17
  24. Automated Curriculum Learning for Neural Networks, ICML17
  25. How to Escape Saddle Points Efficiently, ICML17
  26. Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs, ICML17
  27. An overview of gradient descent optimization algorithms, (https://arxiv.org/abs/1609.04747)
  28. Learning Deep Architectures via Generalized Whitened Neural Networks, Ping Luo ; PMLR 70:2238-2246
  29. The Loss Surface of Deep and Wide Neural Networks, Quynh Nguyen, Matthias Hein ; PMLR 70:2603-2612
  30. Relative Fisher Information and Natural Gradient for Learning Large Modular Models, Ke Sun, Frank Nielsen ; PMLR 70:3289-3298
  31. meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting, Xu Sun, Xuancheng Ren, Shuming Ma, Houfeng Wang ; PMLR 70:3299-3308
  32. Axiomatic Attribution for Deep Networks, Mukund Sundararajan, Ankur Taly, Qiqi Yan ; PMLR 70:3319-3328
  33. Follow the Moving Leader in Deep Learning, Shuai Zheng, James T. Kwok ; PMLR 70:4110-4119
  34. Oracle Complexity of Second-Order Methods for Finite-Sum Problems, ICML17
  35. The Shattered Gradients Problem: If resnets are the answer, then what is the question?, ICML17
  36. Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks, ICML17
  37. End-to-End Differentiable Adversarial Imitation Learning, ICML17
  38. Neural Optimizer Search with Reinforcement Learning, ICML17
  39. Adaptive Neural Networks for Efficient Inference, ICML17
  40. Practical Gauss-Newton Optimisation for Deep Learning, ICML17
  41. Deep Tensor Convolution on Multicores, ICML17
  42. The Generalized Reparameterization Gradient, Francisco Ruiz*, Columbia University; Michalis K. Titsias; David Blei, NIPS16
  43. Attend, Infer, Repeat: Fast Scene Understanding with Generative Models, NIPS16
  44. Memory-Efficient Backpropagation Through Time, NIPS16
  45. Professor Forcing: A New Algorithm for Training Recurrent Networks, NIPS16
  46. Understanding the Effective Receptive Field in Deep Convolutional Neural Networks, NIPS16

Generative


  1. GAN tutorial by Ian Goodfellow (NIPS 2016): https://arxiv.org/abs/1701.00160 https://www.youtube.com/watch?v=AJVyzd0rqdc
  2. Goodfellow - Generative Models I - DLSS 2017 https://drive.google.com/file/d/0ByUKRdiCDK7-bTgxTGoxYjQ4NW8/view
  3. Courville - Generative Models II - DLSS 2017. https://drive.google.com/file/d/0B_wzP_JlVFcKQ21udGpTSkh0aVk/view
  4. Makhzani and Frey - PixelGAN Autoencoders.pdf https://drive.google.com/file/d/0B6NHiPcsmak1SFdRN2lmS3FnekE/view
  5. Welling - Graphical Models and Deep Learning.pd https://drive.google.com/file/d/0B6NHiPcsmak1NHJHdzEySzNNQ0U/view
  6. Parallel Multiscale Autoregressive Density Estimation, Scott Reed, Aäron van den Oord, Nal Kalchbrenner, Ziyu Wang, Dan Belov, Nando de Freitas
  7. Count-Based Exploration with Neural Density Models, Georg Ostrovski, Marc Bellemare, Aaron van den Oord, Remi Munos
  8. Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo, Matthew D. Hoffman
  9. Johnson - Graphical Models and Deep Learning https://drive.google.com/file/d/0B6NHiPcsmak1RmZ3bmtFWUd5bjA/view?usp=drive_web
  10. Variational Boosting: Iteratively Refining Posterior Approximations, Andrew Miller, Nicholas J Foti, Ryan Adams
  11. Stochastic Generative Hashing, Bo Dai, Ruiqi Guo, Sanjiv Kumar, Niao He, Le Song, ICML17
  12. Robust Structured Estimation with Single-Index Models, ICML17
  13. Learning to Act by Predicting the Future, ICLR17
  14. Improving Generative Adversarial Networks with Denoising Feature Matching, ICLR17
  15. Boosted Generative Models, ICLR17
  16. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, ICLR17
  17. Robust Probabilistic Modeling with Bayesian Data Reweighting, ICML17
  18. Deep Generative Models for Relational Data with Side Information, ICML17
  19. Learning to Discover Cross-Domain Relations with Generative Adversarial Networks, Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jung Kwon Lee, Jiwon Kim ; PMLR 70:1857-1865
  20. Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks, Lars Mescheder, Sebastian Nowozin, Andreas Geiger ; PMLR 70:2391-2400
  21. McGan: Mean and Covariance Feature Matching GAN, Youssef Mroueh, Tom Sercu, Vaibhava Goel ; PMLR 70:2527-2535
  22. Parallel Multiscale Autoregressive Density Estimation, Scott Reed, Aäron van den Oord, Nal Kalchbrenner, Sergio Gómez Colmenarejo, Ziyu Wang, Yutian Chen, Dan Belov, Nando de Freitas ; PMLR 70:2912-2921
  23. Adversarial Feature Matching for Text Generation, Yizhe Zhang, Zhe Gan, Kai Fan, Zhi Chen, Ricardo Henao, Dinghan Shen, Lawrence Carin ; PMLR 70:4006-4015
  24. Learning Hierarchical Features from Deep Generative Models, Shengjia Zhao, Jiaming Song, Stefano Ermon ; PMLR 70:4091-4099
  25. Wasserstein Generative Adversarial Networks, ICML17
  26. Generalization and Equilibrium in Generative Adversarial Nets (GANs), ICML17
  27. Exponential Family Embeddings, NIPS16
  28. Wasserstein GAN, ICML17

Reinforcement Learning


  1. Hasselt - Deep Reinforcement Learning - RLSS 2017.pdf https://drive.google.com/file/d/0BzUSSMdMszk6UE5TbWdZekFXSE0/view?usp=drive_web
  2. Pineau - RL Basic Concepts - RLSS 2017.pdf https://drive.google.com/file/d/0BzUSSMdMszk6bjl3eU5CVmU0cWs/view http://videolectures.net/deeplearning2016_pineau_reinforcement_learning/ and http://videolectures.net/deeplearning2016_pineau_advanced_topics/
  3. Roux - RL in the Industry - RLSS 2017.pdf https://drive.google.com/file/d/0BzUSSMdMszk6bEprTUpCaHRrQ28/view
  4. Singh - Steps Towards Continual Learning.pdf https://drive.google.com/file/d/0BzUSSMdMszk6YVhFUUNLZnZLSWs/view?usp=drive_web
  5. Sutton - Temporal-Difference Learning- RLSS 2017.pd https://drive.google.com/file/d/0BzUSSMdMszk6VE9kMkY2SzQzSW8/view?usp=drive_web
  6. Szepesvari - Theory of RL - RLSS 2017.pdf https://drive.google.com/file/d/0BzUSSMdMszk6U194Ym5jSnZQbGM/view?usp=drive_web
  7. Thomas - Safe Reinforcement Learning - RLSS 2017.pdf https://drive.google.com/file/d/0BzUSSMdMszk6TDRMRGRaM0dBcHM/view?usp=drive_web
  8. Minimax Regret Bounds for Reinforcement Learning, Mohammad Gheshlaghi Azar, Ian Osband, Remi Munos
  9. Why is Posterior Sampling Better than Optimism for Reinforcement Learning? Ian Osband, Benjamin Van Roy
  10. DARLA: Improving Zero-Shot Transfer in Reinforcement Learning, Irina Higgins, Arka Pal, Andrei Rusu, Loic Matthey, Chris Burgess, Alexander Pritzel, Matt Botvinick, Charles Blundell, Alexander Lerchner
  11. A Distributional Perspective on Reinforcement Learning, Marc G. Bellemare, Will Dabney, Remi Munos
  12. A Laplacian Framework for Option Discovery in Reinforcement Learning, Marlos Machado (Univ. Alberta), Marc G. Bellemare, Michael Bowling
  13. The Predictron: End-to-End Learning and Planning, David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David Reichert, Neil Rabinowitz, Andre Barreto, Thomas Degris
  14. FeUdal Networks for Hierarchical Reinforcement Learning, Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu
  15. Neural Episodic Control, Alex Pritzel, Benigno Uria, Sriram Srinivasan, Adria Puigdomenech, Oriol Vinyals, Demis Hassabis, Daan Wierstra, Charles Blundell
  16. Robust Adversarial Reinforcement Learning, Lerrel Pinto, James Davidson, Rahul Sukthankar, Abhinav Gupta
  17. Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs, Michael Gygli, Mohammad Norouzi, Anelia Angelova
  18. Distral: Robust Multitask Reinforcement Learning, https://arxiv.org/pdf/1707.04175.pdf
  19. Reinforcement Learning with Unsupervised Auxiliary Tasks, ICLR17
  20. Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic, ICLR17
  21. DARLA: Improving Zero-Shot Transfer in Reinforcement Learning, ICML17
  22. Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning, Junhyuk Oh, Satinder Singh, Honglak Lee, Pushmeet Kohli ; PMLR 70:2661-2670
  23. Count-Based Exploration with Neural Density Models, Georg Ostrovski, Marc G. Bellemare, Aäron van den Oord, Rémi Munos ; PMLR 70:2721-2730
  24. Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction, Wen Sun, Arun Venkatraman, Geoffrey J. Gordon, Byron Boots, J. Andrew Bagnell ; PMLR 70:3309-3318


  1. ICLR 2017 Papers
  2. ICML 2017 Papers
  3. NIPS 2017 Papers
  4. Yann LeCun
  5. Y. Bengio
  6. G. Hinton
  7. Juergen Schmidhuber
