Recent Readings for RL and Deep RL (since 2017)


6Reinforcement (Index of Posts):

No. Read Date Title and Information We Read @
1 2019, Dec, 8 deep2reproduce 2019 Fall - 6Reinforcement papers 2019-fall Students deep2reproduce
2 2018, Aug, 13 Application18- DNNs in a Few BioMedical Tasks 2018-team
3 2018, Aug, 3 Reliable18- Testing and Verifying DNNs 2018-team
4 2017, Nov, 30 RL IV - RL with varying structures 2017-W15
5 2017, Nov, 28 RL III - Basic tutorial RLSS17 (2) 2017-W14
6 2017, Nov, 21 RL II - Basic tutorial RLSS17 2017-W14
7 2017, Aug, 29 Reinforcement I - Pineau - RL Basic Concepts 2017-W2

[1]: Reinforcement I - Pineau - RL Basic Concepts

6Reinforcement 0Basics RL

Pineau - RL Basic Concepts

Presenter Papers Paper URL Our Slides
DLSS16 video    
RLSS17 slideRaw + video+ slide    

[2]: RL II - Basic tutorial RLSS17

6Reinforcement RL Multi-Task
Presenter Papers Paper URL Our Slides
Jack Hasselt - Deep Reinforcement Learning RLSS17.pdf + video PDF
Tianlu Roux - RL in the Industry RLSS17.pdf + video PDF / PDF-Bandit
Xueying Singh - Steps Towards Continual Learning pdf + video PDF
GaoJi Distral: Robust Multitask Reinforcement Learning 1 PDF PDF

[3]: RL III - Basic tutorial RLSS17 (2)

6Reinforcement alphaGO Planning Temporal-Difference
Presenter Papers Paper URL Our Slides
Anant The Predictron: End-to-End Learning and Planning, ICLR17 1 PDF PDF
ChaoJiang Szepesvari - Theory of RL 2 RLSS.pdf + Video PDF
GaoJi Mastering the game of Go without human knowledge / Nature 2017 3 PDF PDF
  Thomas - Safe Reinforcement Learning RLSS17.pdf + video  
  Sutton - Temporal-Difference Learning RLSS17.pdf + Video  

[4]: RL IV - RL with varying structures

6Reinforcement Auxiliary Sampling Value-Networks structured Imitation-Learning Hierarchical
Presenter Papers Paper URL Our Slides
Ceyer Reinforcement Learning with Unsupervised Auxiliary Tasks, ICLR17 1 PDF PDF
Beilun Why is Posterior Sampling Better than Optimism for Reinforcement Learning? Ian Osband, Benjamin Van Roy 2 PDF PDF
Ji Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction, ICML17 3 PDF PDF
Xueying End-to-End Differentiable Adversarial Imitation Learning, ICML17 4 PDF PDF
  Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs, ICML17 PDF  
  FeUdal Networks for Hierarchical Reinforcement Learning, ICML17 5 PDF  

[5]: Reliable18- Testing and Verifying DNNs

3Reliable 6Reinforcement RL Fuzzing Adversarial-Examples verification software-testing black-box white-box
Presenter Papers Paper URL Our Slides
GaoJi Deep Reinforcement Fuzzing, Konstantin Böttinger, Patrice Godefroid, Rishabh Singh PDF PDF
GaoJi Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks, Guy Katz, Clark Barrett, David Dill, Kyle Julian, Mykel Kochenderfer PDF PDF
GaoJi DeepTest: Automated Testing of Deep-Neural-Network-driven Autonomous Cars, Yuchi Tian, Kexin Pei, Suman Jana, Baishakhi Ray PDF PDF
GaoJi A few Recent (2018) papers on Black-box Adversarial Attacks, like Prior Convictions: Black-Box Adversarial Attacks with Bandits and Priors 1 PDF PDF
GaoJi A few Recent papers of Adversarial Attacks on reinforcement learning, like Adversarial Attacks on Neural Network Policies (Sandy Huang, Nicolas Papernot, Ian Goodfellow, Yan Duan, Pieter Abbeel) PDF PDF
Testing DeepXplore: Automated Whitebox Testing of Deep Learning Systems PDF  

[6]: Application18- DNNs in a Few BioMedical Tasks

3Reliable 6Reinforcement 9DiscreteApp brain RNA DNA Genomics generative
Presenter Papers Paper URL Our Slides
Arshdeep DeepLesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning. PDF PDF
Arshdeep Solving the RNA design problem with reinforcement learning, PLOSCB 1 PDF PDF
Arshdeep Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk 2 PDF PDF
Arshdeep Towards Gene Expression Convolutions using Gene Interaction Graphs, Francis Dutil, Joseph Paul Cohen, Martin Weiss, Georgy Derevyanko, Yoshua Bengio 3 PDF PDF
Brandon Kipoi: Accelerating the Community Exchange and Reuse of Predictive Models for Genomics PDF PDF
Arshdeep Feedback GAN (FBGAN) for DNA: a Novel Feedback-Loop Architecture for Optimizing Protein Functions 2 PDF PDF

[7]: deep2reproduce 2019 Fall - 6Reinforcement papers

6Reinforcement verification RL
Team INDEX Title & Link Tags Our Slide
T1 Safe Reinforcement Learning via Shielding RL, safety, verification OurSlide
BackTop