|Presenter|Papers|Paper URL|Our Slides|
|---|---|---|---|
|AE|Intriguing properties of neural networks| | |
|AE|Explaining and Harnessing Adversarial Examples| | |
|AE|Towards Deep Learning Models Resistant to Adversarial Attacks| | |
|AE|DeepFool: a simple and accurate method to fool deep neural networks| | |
|AE|Towards Evaluating the Robustness of Neural Networks by Carlini and Wagner| | |
|Data|Basic Survey of ImageNet - LSVRC competition|URL| |
|Understand|Understanding Black-box Predictions via Influence Functions| | |
|Understand|Deep inside convolutional networks: Visualising image classification models and saliency maps| | |
|Understand|Been Kim, Interpretable Machine Learning, ICML17 Tutorial [^1]| | |
|provable|Provable defenses against adversarial examples via the convex outer adversarial polytope, Eric Wong, J. Zico Kolter|URL| |
[^1] Notes about Interpretable Machine Learning
Notes on Interpretability in Machine Learning, from Been Kim's tutorial
by Brandon Liu
Important Criteria in ML Systems
- Avoiding technical debt
- Providing the right to explanation
- Example: for self-driving cars and other autonomous vehicles, it is almost impossible to come up with all possible unit tests.
What is interpretability?
- The ability to give explanations to humans.
Two Branches of Interpretability
- In the context of an application: if the system is useful in either a practical application or a simplified version of it, then it must be somehow interpretable.
- Via a quantifiable proxy: a researcher might first claim that some model class (e.g. sparse linear models, rule lists, gradient boosted trees) is interpretable and then present algorithms to optimize within that class.
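
To make the "quantifiable proxy" idea concrete, here is a minimal sketch of a rule list, one of the model classes named above. The feature names, thresholds, and labels are invented for illustration, and the rules are written by hand; learning a rule list from data is not shown.

```python
from typing import Callable, Dict, List, Tuple

# Each rule: (human-readable description, condition on a record, predicted label).
Rule = Tuple[str, Callable[[Dict[str, float]], bool], str]

# Hypothetical, hand-written rules; rules are checked in order, first match wins.
RULES: List[Rule] = [
    (
        "age < 25 and prior_claims > 2",
        lambda r: r["age"] < 25 and r["prior_claims"] > 2,
        "high risk",
    ),
    (
        "prior_claims > 0",
        lambda r: r["prior_claims"] > 0,
        "medium risk",
    ),
]
DEFAULT_LABEL = "low risk"


def predict(record: Dict[str, float]) -> Tuple[str, str]:
    """Return (label, explanation), where the explanation is the rule that fired."""
    for description, condition, label in RULES:
        if condition(record):
            return label, description
    return DEFAULT_LABEL, "default (no rule matched)"


label, reason = predict({"age": 22, "prior_claims": 3})
print(f"{label} because {reason}")
```

Every prediction comes with the single rule that produced it, which is the usual argument for treating this model class as interpretable.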
Before building any model
- Exploratory data analysis
Building a new model
- Rule-based or per-feature-based models
After building a model
- Sensitivity analysis, gradient-based methods
- Mimic/surrogate models
- Investigation of hidden layers
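
As a rough illustration of the sensitivity-analysis item in the list above, the sketch below estimates how strongly each input feature affects a model's output using central finite differences. The `black_box` function, its coefficients, and the step size `eps` are assumptions made up for this example; in practice the model would be a trained predictor that can only be queried.

```python
from typing import Callable, List, Sequence


def black_box(x: Sequence[float]) -> float:
    """Hypothetical model standing in for any trained predictor we can query."""
    return 3.0 * x[0] - 0.5 * x[1] ** 2 + 0.0 * x[2]


def sensitivity(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Central finite-difference estimate of d model / d x[i] at the point x."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grads.append((model(up) - model(down)) / (2 * eps))
    return grads


point = [1.0, 2.0, 5.0]
scores = sensitivity(black_box, point)

# Rank features by the magnitude of their local effect on the output.
ranking = sorted(range(len(point)), key=lambda i: abs(scores[i]), reverse=True)
print(scores, ranking)
```

The scores are local: they describe the model's behaviour near `point` only, so a feature with a small score here may still matter elsewhere in the input space.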