3Reliable


Recent Readings for Trustworthy Properties of Deep Neural Networks (since 2017) (Index of Posts):

| No. | Read Date | Title and Information | We Read @ |
|-----|-----------|-----------------------|-----------|
| 1 | 2020, Aug 5 | Interpretable Deep Learning | 2020-W8 |
| 2 | 2020, Jul 5 | Trustworthy Deep Learning | 2020-W7 |
| 3 | 2019, Dec 10 | deep2reproduce 2019 Fall - 3Reliable papers | 2019-fall Students deep2reproduce |
| 4 | 2019, Apr 5 | GNN to Understand | 2019-W12 |
| 5 | 2019, Mar 6 | GNN Robustness | 2019-W7 |
| 6 | 2018, Dec 2 | Reliable18- Adversarial Attacks and DNN | 2018-team |
| 7 | 2018, Nov 20 | Reliable18- Adversarial Attacks and DNN | 2018-team |
| 8 | 2018, Oct 12 | Reliable18- Understand DNNs | 2018-team |
| 9 | 2018, Aug 13 | Application18- DNNs in a Few BioMedical Tasks | 2018-team |
| 10 | 2018, Aug 3 | Reliable18- Testing and Verifying DNNs | 2018-team |
| 11 | 2018, May 20 | Reliable18- Adversarial Attacks and DNN and More | 2018-team |
| 12 | 2018, May 12 | Reliable18- Adversarial Attacks and DNN | 2018-team |
| 13 | 2018, Jan 10 | Application18- Property of DeepNN Models and Discrete tasks | 2018-team |
| 14 | 2017, Oct 26 | Reliable Applications VI - Robustness2 | 2017-W10 |
| 15 | 2017, Oct 23 | Reliable Applications IV - Robustness | 2017-W9 |
| 16 | 2017, Oct 17 | Reliable Applications III - Interesting Tasks | 2017-W9 |
| 17 | 2017, Oct 12 | Reliable Applications II - Data privacy | 2017-W8 |
| 18 | 2017, Oct 11 | Reliable Applications V - Understanding2 | 2017-W10 |
| 19 | 2017, Oct 10 | Reliable Applications I - Understanding | 2017-W8 |
| 20 | 2017, Jul 22 | Reliable17- Testing and Machine Learning Basics | 2017-team |
| 21 | 2017, Feb 22 | Reliable17- Secure Machine Learning | 2017-team |
| 22 | 2017, Jan 19 | Basic16- Basic Deep NN and Robustness | 2017-team |


Here is a detailed list of posts!



[1]: Interpretable Deep Learning


Tags: Interpretable, black-box, causal, attention, shapley, concept

| Index | Papers | Our Slides |
|-------|--------|------------|
| 0 | A survey on Interpreting Deep Learning Models | Eli Survey |
|   | Interpretable Machine Learning: Definitions, Methods, Applications | Arsh Survey |
| 1 | Explaining Explanations: Axiomatic Feature Interactions for Deep Networks | Arsh Survey |
| 2 | Shapley Value review | Arsh Survey |
|   | L-Shapley and C-Shapley: Efficient Model Interpretation for Structured Data | Bill Survey |
|   | Consistent Individualized Feature Attribution for Tree Ensembles | Bill Survey |
|   | Summary for A Value for n-Person Games | Pan Survey |
|   | L-Shapley and C-Shapley: Efficient Model Interpretation for Structured Data | Rishab Survey |
| 3 | Hierarchical Interpretations of Neural Network Predictions | Arsh Survey |
|   | Hierarchical Interpretations of Neural Network Predictions | Rishab Survey |
| 4 | Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs | Arsh Survey |
|   | Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs | Rishab Survey |
| 5 | Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models | Rishab Survey |
|   |   | Sanchit Survey |
|   | Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection | Sanchit Survey |
| 6 | This Looks Like That: Deep Learning for Interpretable Image Recognition | Pan Survey |
| 7 | AllenNLP Interpret | Rishab Survey |
| 8 | Discovery of Natural Language Concepts in Individual Units of CNNs | Rishab Survey |
| 9 | How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations | Rishab Survey |
| 10 | Attention is not Explanation | Sanchit Survey |
|    |   | Pan Survey |
| 11 | Axiomatic Attribution for Deep Networks | Sanchit Survey |
| 12 | Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization | Sanchit Survey |
| 13 | Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers | Sanchit Survey |
| 14 | “Why Should I Trust You?” Explaining the Predictions of Any Classifier | Yu Survey |
| 15 | Interpretations are Useful: Penalizing Explanations to Align Neural Networks with Prior Knowledge | Pan Survey |
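Several entries above (the Shapley Value review, L-Shapley/C-Shapley, and the tree-ensemble attribution paper) build on the Shapley value from cooperative game theory. As a quick refresher, and not taken from any of the listed papers, here is a toy exact computation: the Shapley value of a feature is its marginal contribution averaged over all coalitions of the other features, which costs O(2^n) and is why the papers above develop approximations.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values for a payoff function `value` that maps a
    frozenset of feature names to a score. Exponential cost: toy use only."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Weight of a coalition of size k in the Shapley formula.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

# Toy additive "model": a coalition's payoff is the sum of its weights,
# so each feature's Shapley value is exactly its own weight.
weights = {"x1": 1.0, "x2": 2.0, "x3": -0.5}
payoff = lambda s: sum(weights[f] for f in s)
phi = shapley_values(list(weights), payoff)
```

By the efficiency axiom, the values sum to the payoff of the full feature set, which is a useful sanity check for any Shapley approximation.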

[2]: Trustworthy Deep Learning


Tags: bias, data valuation, robustness, adversarial-examples, regularization

| Index | Papers | Our Slides |
|-------|--------|------------|
| 1 | Bias also Matters: Bias Attribution for Deep Neural Network Explanation | Arsh Survey |
| 2 | Data Shapley: Equitable Valuation of Data for Machine Learning | Arsh Survey |
|   | What is your data worth? Equitable Valuation of Data | Sanchit Survey |
| 3 | Neural Network Attributions: A Causal Perspective | Zhe Survey |
| 4 | Defending Against Neural Fake News | Eli Survey |
| 5 | Interpretation of Neural Networks is Fragile | Eli Survey |
|   | Interpretation of Neural Networks is Fragile | Pan Survey |
| 6 | Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization | Eli Survey |
| 7 | Retrofitting Word Vectors to Semantic Lexicons | Morris Survey |
| 8 | On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models | Morris Survey |
| 9 | Towards Deep Learning Models Resistant to Adversarial Attacks | Pan Survey |
| 10 | Robust Attribution Regularization | Pan Survey |
| 11 | Sanity Checks for Saliency Maps | Sanchit Survey |
| 12 | Survey of data generation and evaluation in interpreting DNN pipelines | Sanchit Survey |
| 13 | Think Architecture First: Benchmarking Deep Learning Interpretability in Time Series Predictions | Sanchit Survey |
| 14 | Universal Adversarial Triggers for Attacking and Analyzing NLP | Sanchit Survey |
| 15 | Apricot: Submodular selection for data summarization in Python | Arsh Survey |

[3]: deep2reproduce 2019 Fall - 3Reliable papers


Tags: submodular, safety, adversarial-examples, robustness, model-as-sample, privacy, Attribution, Relational

| Team | Title & Link | Tags | Our Slide |
|------|--------------|------|-----------|
| T3 | Deletion-Robust Submodular Maximization: Data Summarization with Privacy and Fairness Constraints | submodular, coreset, safety | OurSlide |
| T6 | Decision Boundary Analysis of Adversarial Examples | adversarial-examples | OurSlide |
| T8 | Robustness may be at odds with accuracy | robustness | OurSlide |
| T18 | Towards Reverse-Engineering Black-Box Neural Networks | meta, model-as-sample, safety, privacy | OurSlide |
| T23 | The Odds are Odd: A Statistical Test for Detecting Adversarial Examples | adversarial-examples | OurSlide |
| T25 | Learning how to explain neural networks: PatternNet and PatternAttribution | Attribution, Interpretable | OurSlide |
| T31 | Detecting Statistical Interactions from Neural Network Weights | Interpretable, Relational | OurSlide |

[4]: GNN to Understand


Tags: Interpretable, black-box, causal, seq2seq, noise, knowledge-graph, attention

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Understand | Faithful and Customizable Explanations of Black Box Models | PDF | Derrick PDF |
| Understand | A causal framework for explaining the predictions of black-box sequence-to-sequence models, EMNLP17 | PDF | GaoJi PDF + Bill PDF |
| Understand | How Powerful are Graph Neural Networks? / Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning | PDF + PDF | GaoJi PDF |
| Understand | Interpretable Graph Convolutional Neural Networks for Inference on Noisy Knowledge Graphs + GNN Explainer: A Tool for Post-hoc Explanation of Graph Neural Networks | PDF + PDF | GaoJi PDF |
| Understand | Attention is not Explanation, 2019 | PDF |   |
| Understand | Understanding attention in graph neural networks, 2019 | PDF |   |

[5]: GNN Robustness


Tags: graph, structured, Adversarial-Examples, binary

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Robust | Adversarial Attacks on Graph Structured Data | PDF | Faizan PDF + GaoJi PDF |
| Robust | KDD’18 Adversarial Attacks on Neural Networks for Graph Data | PDF | Faizan PDF + GaoJi PDF |
| Robust | Attacking Binarized Neural Networks | PDF | Faizan PDF |

[6]: Reliable18- Adversarial Attacks and DNN


Tags: Adversarial-Examples, visualizing, Interpretable, EHR, NLP

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Jennifer | Adversarial Attacks Against Medical Deep Learning Systems | PDF | PDF |
| Jennifer | Adversarial-Playground: A Visualization Suite Showing How Adversarial Examples Fool Deep Learning | PDF | PDF |
| Jennifer | Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers | PDF | PDF |
| Jennifer | CleverHans | PDF | PDF |
| Ji | Ji-f18: New papers about adversarial attacks |   | PDF |

[7]: Reliable18- Adversarial Attacks and DNN


Tags: Adversarial-Examples, software-testing, Interpretable, distillation

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Bill | Adversarial Examples that Fool both Computer Vision and Time-Limited Humans | PDF | PDF |
| Bill | Adversarial Attacks Against Medical Deep Learning Systems | PDF | PDF |
| Bill | TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing | PDF | PDF |
| Bill | Distilling the Knowledge in a Neural Network | PDF | PDF |
| Bill | Defensive Distillation is Not Robust to Adversarial Examples | PDF | PDF |
| Bill | Adversarial Logit Pairing, Harini Kannan, Alexey Kurakin, Ian Goodfellow | PDF | PDF |

[8]: Reliable18- Understand DNNs


Tags: visualizing, interpretable, Attribution, GAN, understanding

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Jack | A Unified Approach to Interpreting Model Predictions | PDF | PDF |
| Jack | “Why Should I Trust You?”: Explaining the Predictions of Any Classifier | PDF | PDF |
| Jack | Visual Feature Attribution using Wasserstein GANs | PDF | PDF |
| Jack | GAN Dissection: Visualizing and Understanding Generative Adversarial Networks | PDF | PDF |
| GaoJi | Recent Interpretable machine learning papers | PDF | PDF |
| Jennifer | The Building Blocks of Interpretability | PDF | PDF |

[9]: Application18- DNNs in a Few BioMedical Tasks


Tags: brain, RNA, DNA, Genomics, generative

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Arshdeep | DeepLesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning | PDF | PDF |
| Arshdeep | Solving the RNA design problem with reinforcement learning, PLOSCB | PDF | PDF |
| Arshdeep | Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk | PDF | PDF |
| Arshdeep | Towards Gene Expression Convolutions using Gene Interaction Graphs, Francis Dutil, Joseph Paul Cohen, Martin Weiss, Georgy Derevyanko, Yoshua Bengio | PDF | PDF |
| Brandon | Kipoi: Accelerating the Community Exchange and Reuse of Predictive Models for Genomics | PDF | PDF |
| Arshdeep | Feedback GAN (FBGAN) for DNA: a Novel Feedback-Loop Architecture for Optimizing Protein Functions | PDF | PDF |

[10]: Reliable18- Testing and Verifying DNNs


Tags: RL, Fuzzing, Adversarial-Examples, verification, software-testing, black-box, white-box

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| GaoJi | Deep Reinforcement Fuzzing, Konstantin Böttinger, Patrice Godefroid, Rishabh Singh | PDF | PDF |
| GaoJi | Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks, Guy Katz, Clark Barrett, David Dill, Kyle Julian, Mykel Kochenderfer | PDF | PDF |
| GaoJi | DeepTest: Automated Testing of Deep-Neural-Network-driven Autonomous Cars, Yuchi Tian, Kexin Pei, Suman Jana, Baishakhi Ray | PDF | PDF |
| GaoJi | A few recent (2018) papers on black-box adversarial attacks, e.g., Prior Convictions: Black-Box Adversarial Attacks with Bandits and Priors | PDF | PDF |
| GaoJi | A few recent papers on adversarial attacks against reinforcement learning, e.g., Adversarial Attacks on Neural Network Policies (Sandy Huang, Nicolas Papernot, Ian Goodfellow, Yan Duan, Pieter Abbeel) | PDF | PDF |
| Testing | DeepXplore: Automated Whitebox Testing of Deep Learning Systems | PDF |   |

[11]: Reliable18- Adversarial Attacks and DNN and More


Tags: seq2seq, Adversarial-Examples, Certified-Defense

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Bill | Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples | PDF | PDF |
| Bill | Adversarial Examples for Evaluating Reading Comprehension Systems, Robin Jia, Percy Liang | PDF | PDF |
| Bill | Certified Defenses against Adversarial Examples, Aditi Raghunathan, Jacob Steinhardt, Percy Liang | PDF | PDF |
| Bill | Provably Minimally-Distorted Adversarial Examples, Nicholas Carlini, Guy Katz, Clark Barrett, David L. Dill | PDF | PDF |

[12]: Reliable18- Adversarial Attacks and DNN


Tags: Adversarial-Examples, generative, Interpretable

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Bill | Intriguing Properties of Adversarial Examples, Ekin D. Cubuk, Barret Zoph, Samuel S. Schoenholz, Quoc V. Le | PDF | PDF |
| Bill | Adversarial Spheres | PDF | PDF |
| Bill | Adversarial Transformation Networks: Learning to Generate Adversarial Examples, Shumeet Baluja, Ian Fischer | PDF | PDF |
| Bill | Thermometer Encoding: One Hot Way to Resist Adversarial Examples | PDF | PDF |
|   | Adversarial Logit Pairing, Harini Kannan, Alexey Kurakin, Ian Goodfellow | PDF |   |

[13]: Application18- Property of DeepNN Models and Discrete tasks


Tags: embedding, generative, NLP, generalization

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Bill | Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation | PDF | PDF |
| Bill | Measuring the Tendency of CNNs to Learn Surface Statistical Regularities, Jason Jo, Yoshua Bengio | PDF | PDF |
| Bill | Generating Sentences by Editing Prototypes, Kelvin Guu, Tatsunori B. Hashimoto, Yonatan Oren, Percy Liang | PDF | PDF |
| Bill | On the Importance of Single Directions for Generalization, Ari S. Morcos, David G.T. Barrett, Neil C. Rabinowitz, Matthew Botvinick | PDF | PDF |

[14]: Reliable Applications VI - Robustness2


Tags: Adversarial-Examples, noise, Composition, robustness

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Tianlu | Robustness of Classifiers: From Adversarial to Random Noise, NIPS16 | PDF | PDF |
| Anant | Blind Attacks on Machine Learners, NIPS16 | PDF | PDF |
|   | Data Noising as Smoothing in Neural Network Language Models (Ng), ICLR17 | PDF |   |
|   | The Robustness of Estimator Composition, NIPS16 | PDF |   |

[15]: Reliable Applications IV - Robustness


Tags: Adversarial-Examples, high-dimensional, robustness

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| GaoJi | Delving into Transferable Adversarial Examples and Black-box Attacks, ICLR17 | PDF | PDF |
| Shijia | On Detecting Adversarial Perturbations, ICLR17 | PDF | PDF |
| Anant | Parseval Networks: Improving Robustness to Adversarial Examples, ICML17 | PDF | PDF |
| Bargav | Being Robust (in High Dimensions) Can Be Practical, ICML17 | PDF | PDF |

[16]: Reliable Applications III - Interesting Tasks


Tags: QA, noise, Neural-Programming, Hierarchical

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Jack | Learning to Query, Reason, and Answer Questions on Ambiguous Texts, ICLR17 | PDF | PDF |
| Arshdeep | Making Neural Programming Architectures Generalize via Recursion, ICLR17 | PDF | PDF |
| Xueying | Towards Deep Interpretability (MUS-ROVER II): Learning Hierarchical Representations of Tonal Music, ICLR17 | PDF | PDF |

[17]: Reliable Applications II - Data privacy


Tags: Semi-supervised, Privacy, Domain-adaptation

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Xueying | Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, ICLR17 | PDF | PDF |
| Bargav | Deep Learning with Differential Privacy, CCS16 | PDF + video | PDF |
| Bargav | Privacy-Preserving Deep Learning, CCS15 | PDF | PDF |
| Xueying | Domain Separation Networks, NIPS16 | PDF | PDF |
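The CCS16 paper above (Deep Learning with Differential Privacy) rests on two mechanical steps that generalize beyond deep nets: clip each per-example gradient to a fixed L2 norm, then add Gaussian noise to the batch sum before averaging. A toy pure-Python sketch of that single step, under illustrative names and constants (the paper additionally tracks the cumulative privacy loss with a moments accountant, omitted here):

```python
import math
import random

def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, lr, params, rng):
    """One DP-SGD-style update (sketch): clip each per-example gradient to
    L2 norm `clip_norm`, sum, add N(0, (noise_multiplier*clip_norm)^2) noise
    per coordinate, average over the batch, and take a gradient step."""
    n = len(per_example_grads)
    dim = len(params)
    summed = [0.0] * dim
    for g in per_example_grads:
        norm = math.sqrt(sum(v * v for v in g))
        scale = min(1.0, clip_norm / (norm + 1e-12))  # clip, never amplify
        for i in range(dim):
            summed[i] += g[i] * scale
    noisy_avg = [(s + rng.gauss(0.0, noise_multiplier * clip_norm)) / n
                 for s in summed]
    return [p - lr * g for p, g in zip(params, noisy_avg)]
```

With `noise_multiplier=0` and gradients already inside the clip norm, this reduces to plain minibatch SGD, which is a convenient correctness check; the privacy guarantee comes entirely from the clip-then-noise combination.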

[18]: Reliable Applications V - Understanding2


Tags: visualizing, Difference-Analysis, Attribution, Composition

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Rita | Visualizing Deep Neural Network Decisions: Prediction Difference Analysis, ICLR17 | PDF | PDF |
| Arshdeep | Axiomatic Attribution for Deep Networks, ICML17 | PDF | PDF |
|   | The Robustness of Estimator Composition, NIPS16 | PDF |   |

[19]: Reliable Applications I - Understanding


Tags: Interpretable, Model-Criticism, random, Difference-Analysis, Attribution

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Rita | Learning Important Features Through Propagating Activation Differences, ICML17 | PDF | PDF |
| GaoJi | Examples are not Enough, Learn to Criticize! Model Criticism for Interpretable Machine Learning, NIPS16 | PDF | PDF |
| Rita | Learning Kernels with Random Features, Aman Sinha, John Duchi | PDF | PDF |

[20]: Reliable17-Testing and Machine Learning Basics


Tags: software-testing, white-box, black-box, robustness, Metamorphic, Influence-Functions

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| GaoJi | A Few Useful Things to Know About Machine Learning | PDF | PDF |
| GaoJi | A few papers related to testing learning, e.g., Understanding Black-box Predictions via Influence Functions | PDF | PDF |
| GaoJi | Automated White-box Testing of Deep Learning Systems | PDF | PDF |
| GaoJi | Testing and Validating Machine Learning Classifiers by Metamorphic Testing | PDF | PDF |
| GaoJi | Software Testing: A Research Travelogue (2000–2014) | PDF | PDF |

[21]: Reliable17-Secure Machine Learning


Tags: secure, Privacy, Cryptography

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| Tobin | Summary of a few papers on machine learning and cryptography (e.g., Learning to Protect Communications with Adversarial Neural Cryptography) | PDF | PDF |
| Tobin | Privacy Aware Learning (NIPS12) | PDF | PDF |
| Tobin | Can Machine Learning Be Secure? (2006) | PDF | PDF |

[22]: Basic16- Basic Deep NN and Robustness


Tags: Adversarial-Examples, robustness, visualizing, Interpretable, Certified-Defense

| Presenter | Papers | Paper URL | Our Slides |
|-----------|--------|-----------|------------|
| AE | Intriguing Properties of Neural Networks | PDF |   |
| AE | Explaining and Harnessing Adversarial Examples | PDF |   |
| AE | Towards Deep Learning Models Resistant to Adversarial Attacks | PDF |   |
| AE | DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks | PDF |   |
| AE | Towards Evaluating the Robustness of Neural Networks, by Carlini and Wagner | PDF | PDF |
| Data | Basic Survey of ImageNet - LSVRC competition | URL | PDF |
| Understand | Understanding Black-box Predictions via Influence Functions | PDF |   |
| Understand | Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps | PDF |   |
| Understand | Been Kim, Interpretable Machine Learning, ICML17 Tutorial | PDF |   |
| provable | Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope, Eric Wong, J. Zico Kolter | URL |   |
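The second AE entry above (Explaining and Harnessing Adversarial Examples) introduced the fast gradient sign method, which amounts to one line of arithmetic: perturb every input coordinate by eps in the direction that increases the loss. A minimal pure-Python sketch against a toy logistic-regression model (all weights, inputs, and the eps value here are illustrative, not taken from the paper):

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(x, w, b, y, eps):
    """One FGSM step: x_adv = x + eps * sign(dL/dx).
    For logistic regression with cross-entropy loss, dL/dx = (p - y) * w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    sign = lambda v: (v > 0) - (v < 0)
    return [xi + eps * sign((p - y) * wi) for wi, xi in zip(w, x)]

rng = random.Random(0)
w = [rng.gauss(0, 1) for _ in range(20)]   # toy model weights
x = [rng.gauss(0, 1) for _ in range(20)]   # a clean input
b, eps = 0.0, 0.3
p_clean = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
y = 1.0 if p_clean > 0.5 else 0.0          # the model's own prediction
x_adv = fgsm(x, w, b, y, eps)
p_adv = sigmoid(sum(wi * xi for wi, xi in zip(w, x_adv)) + b)
# Every coordinate moves in the loss-increasing direction, so the logit
# shifts by eps * sum(|w_i|) and confidence in the predicted label drops.
```

The same sign trick is the building block for the stronger iterative attacks (e.g., the PGD attack in Towards Deep Learning Models Resistant to Adversarial Attacks, also listed above), which simply repeat this step with projection.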


