deep2reproduce 2019 Fall - 1Analysis papers

Team Index	Title & Link	Tags	Our Slide
T2	Empirical Study of Example Forgetting During Deep Neural Network Learning	Sample Selection, forgetting	OurSlide
T29	Select Via Proxy: Efficient Data Selection For Training Deep Networks	Sample Selection	OurSlide
T9	How SGD Selects the Global Minima in over-parameterized Learning	optimization	OurSlide
T10	Escaping Saddles with Stochastic Gradients	optimization	OurSlide
T13	To What Extent Do Different Neural Networks Learn the Same Representation	subspace	OurSlide
T19	On the Information Bottleneck Theory of Deep Learning	informax	OurSlide
T20	Visualizing the Loss Landscape of Neural Nets	normalization	OurSlide
T21	Using Pre-Training Can Improve Model Robustness and Uncertainty	training, analysis	OurSlide
T24	Norm matters: efficient and accurate normalization schemes in deep networks	normalization	OurSlide
