2Architecture


Recent Readings Relating to Architectures of Deep Neural Networks (since 2017) (Index of Posts):

No. Read Date Title and Information We Read @
1 2019, Dec, 11 deep2reproduce 2019 Fall - 2Architecture papers 2019-fall Students deep2reproduce
2 2019, Feb, 22 Geometric Deep Learning 2019-W5
3 2018, Dec, 20 Application18- DNN for QA and MedQA 2018-team
4 2018, Oct, 25 Structure18- DNNs Varying Structures 2018-team
5 2018, Oct, 11 Structures18- DNN for Relations 2018-team
6 2018, Aug, 27 Application18- A few DNN for Question Answering 2018-team
7 2018, May, 11 Structures18- DNN for Multiple Label Classification 2018-team
8 2018, May, 3 Structures18- More Attentions 2018-team
9 2017, Oct, 5 Structure VI - DNN with Adaptive Structures 2017-W7
10 2017, Oct, 3 Structure V - DNN with Attention 3 2017-W7
11 2017, Sep, 28 Structure IV - DNN with Attention 2 2017-W6
12 2017, Sep, 26 Structure III - DNN with Attention 2017-W6
13 2017, Sep, 21 Structure II - DNN with Varying Structures 2017-W5
14 2017, Sep, 19 Structure I - Varying DNN structures 2017-W5
15 2017, Jun, 22 Structures17 - Adaptive Deep Networks II 2017-team
16 2017, Jun, 2 Structures17 - Adaptive Deep Networks I 2017-team
17 2017, Jan, 20 Basic16- DNN to be Scalable 2017-team
18 2017, Jan, 18 Basic16- Basic Deep NN with Memory 2017-team
19 2017, Jan, 12 Basic16- Basic DNN Embedding we read for Ranking/QA 2017-team
20 2017, Jan, 12 Basic16- Basic DNN Reads we finished for NLP/Text 2017-team


Here is a detailed list of posts!



[1]: deep2reproduce 2019 Fall - 2Architecture papers


structured CNN RNN loss
Team INDEX Title & Link Tags Our Slide  
T5 Deep Structured Prediction with Nonlinear Output Transformations   structured OurSlide
T12 Large Margin Deep Networks for Classification large-margin OurSlide
T15 Wide Activation for Efficient and Accurate Image Super-Resolution CNN OurSlide  
T17 Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks RNN OurSlide  
T28 Processing of missing data by neural networks imputation OurSlide  
T27 Implicit Acceleration by Overparameterization analysis OurSlide  

[2]: Geometric Deep Learning


geometric graph matching dynamic manifold invariant
Presenter Papers Paper URL Our Slides
spherical Spherical CNNs Pdf Fuwen PDF + Arshdeep Pdf
dynamic Dynamic Graph CNN for Learning on Point Clouds, 2018 Pdf Fuwen PDF
basics Geometric Deep Learning (simple introduction video) URL  
matching All Graphs Lead to Rome: Learning Geometric and Cycle-Consistent Representations with Graph Convolutional Networks Pdf Fuwen PDF
completion Geometric matrix completion with recurrent multi-graph neural networks Pdf Fuwen PDF
Tutorial Geometric Deep Learning on Graphs and Manifolds URL Arsh PDF
matching Similarity Learning with Higher-Order Proximity for Brain Network Analysis   Arsh PDF
pairwise Pixel to Graph with Associative Embedding PDF Fuwen PDF
3D 3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data URL Fuwen PDF

[3]: Application18- DNN for QA and MedQA


seq2seq recommendation QA graph relational EHR
Presenter Papers Paper URL Our Slides
Bill Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning PDF PDF
Chao Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis (I) PDF PDF
Chao Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis (II) PDF PDF
Derrick Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis (III) PDF PDF
Chao Reading Wikipedia to Answer Open-Domain Questions PDF PDF
Jennifer Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text PDF PDF

[4]: Structure18- DNNs Varying Structures


Architecture-Search Hyperparameter dynamic
Presenter Papers Paper URL Our Slides
Arshdeep Learning Transferable Architectures for Scalable Image Recognition PDF PDF
Arshdeep FractalNet: Ultra-Deep Neural Networks without Residuals PDF PDF

[5]: Structures18- DNN for Relations


relational InfoMax
Presenter Papers Paper URL Our Slides
Arshdeep Relational inductive biases, deep learning, and graph networks PDF PDF
Arshdeep Discriminative Embeddings of Latent Variable Models for Structured Data PDF PDF
Jack Deep Graph Infomax PDF PDF

[6]: Application18- A few DNN for Question Answering


trees metric-learning embedding QA
Presenter Papers Paper URL Our Slides
Derrick GloVe: Global Vectors for Word Representation PDF PDF
Derrick PARL.AI: A unified platform for sharing, training and evaluating dialog models across many tasks. URL PDF
Derrick Scalable Nearest Neighbor Algorithms for High Dimensional Data (PAMI14) PDF PDF
Derrick StarSpace: Embed All The Things! PDF PDF
Derrick Weaver: Deep Co-Encoding of Questions and Documents for Machine Reading, Martin Raison, Pierre-Emmanuel Mazaré, Rajarshi Das, Antoine Bordes PDF PDF

[7]: Structures18- DNN for Multiple Label Classification


multi-label structured Adversarial-loss attention RNN
Presenter Papers Paper URL Our Slides
Chao Maximizing Subset Accuracy with Recurrent Neural Networks in Multi-label Classification PDF PDF
Jack FastXML: A Fast, Accurate and Stable Tree-classifier for eXtreme Multi-label Learning PDF PDF
BasicMLC Multi-Label Classification: An Overview PDF  
SPEN Structured Prediction Energy Networks PDF  
InfNet Learning Approximate Inference Networks for Structured Prediction PDF  
SPENMLC Deep Value Networks PDF  
Adversarial Semantic Segmentation using Adversarial Networks PDF  
EmbedMLC StarSpace: Embed All The Things! PDF  
deepMLC CNN-RNN: A Unified Framework for Multi-label Image Classification / CVPR 2016 PDF
deepMLC Order-Free RNN with Visual Attention for Multi-Label Classification / AAAI 2018 PDF  

[8]: Structures18- More Attentions


attention relational Variational
Presenter Papers Paper URL Our Slides
Arshdeep Show, Attend and Tell: Neural Image Caption Generation with Visual Attention PDF PDF
Arshdeep Latent Alignment and Variational Attention PDF PDF
Arshdeep Modularity Matters: Learning Invariant Relational Reasoning Tasks, Jason Jo, Vikas Verma, Yoshua Bengio PDF PDF

[9]: Structure VI - DNN with Adaptive Structures


dynamic Architecture Search structured
Presenter Papers Paper URL Our Slides
Anant AdaNet: Adaptive Structural Learning of Artificial Neural Networks, ICML17 PDF PDF
Shijia SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization, ICML17 PDF PDF
Jack Proximal Deep Structured Models, NIPS16 PDF PDF
  Optimal Architectures in a Solvable Model of Deep Networks, NIPS16 PDF
Tianlu Large-Scale Evolution of Image Classifiers, ICML17 PDF PDF

[10]: Structure V - DNN with Attention 3


dynamic QA memory
Presenter Papers Paper URL Our Slides
Tianlu Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, ICML17 PDF + code PDF
Jack Reasoning with Memory Augmented Neural Networks for Language Comprehension, ICLR17 PDF PDF
Xueying State-Frequency Memory Recurrent Neural Networks, ICML17 PDF PDF

[11]: Structure IV - DNN with Attention 2


attention transfer-learning relational generative memory Infomax
Presenter Papers Paper URL Our Slides
Jack Attend, Adapt and Transfer: Attentive Deep Architecture for Adaptive Transfer from multiple sources in the same domain, ICLR17 PDF PDF
Arshdeep Bidirectional Attention Flow for Machine Comprehension, ICLR17 PDF + code PDF
Ceyer Image-to-Markup Generation with Coarse-to-Fine Attention, ICML17 PDF + code PDF
ChaoJiang Can Active Memory Replace Attention? (Samy Bengio), NIPS16 PDF PDF
  An Information-Theoretic Framework for Fast and Robust Unsupervised Learning via Neural Population Infomax, ICLR17 PDF  

[12]: Structure III - DNN with Attention


attention transfer-learning dynamic structured QA relational
Presenter Papers Paper URL Our Slides
Rita Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, ICLR17 PDF PDF
Tianlu Dynamic Coattention Networks For Question Answering, ICLR17 PDF + code PDF
ChaoJiang Structured Attention Networks, ICLR17 PDF + code PDF

[13]: Structure II - DNN with Varying Structures


sparsity blocking nonparametric structured QA Interpretable
Presenter Papers Paper URL Our Slides
Shijia Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer, (Dean), ICLR17 PDF PDF
Ceyer Sequence Modeling via Segmentations, ICML17 PDF PDF
Arshdeep Input Switched Affine Networks: An RNN Architecture Designed for Interpretability, ICML17 PDF PDF

[14]: Structure I - Varying DNN structures


dialog QA nonparametric structured sparsity
Presenter Papers Paper URL Our Slides
Jack Learning End-to-End Goal-Oriented Dialog, ICLR17 PDF PDF
Bargav Nonparametric Neural Networks, ICLR17 PDF PDF
Bargav Learning Structured Sparsity in Deep Neural Networks, NIPS16 PDF PDF
Arshdeep Learning the Number of Neurons in Deep Networks, NIPS16 PDF PDF

[15]: Structures17 - Adaptive Deep Networks II


low-rank binary dynamic learn2learn optimization
Presenter Papers Paper URL Our Slides
Arshdeep Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction PDF PDF
Arshdeep Decoupled Neural Interfaces Using Synthetic Gradients PDF PDF
Arshdeep Diet Networks: Thin Parameters for Fat Genomics PDF PDF
Arshdeep Metric Learning with Adaptive Density Discrimination PDF PDF

[16]: Structures17 - Adaptive Deep Networks I


low-rank binary dynamic learn2learn optimization
Presenter Papers Paper URL Our Slides
Arshdeep HyperNetworks, David Ha, Andrew Dai, Quoc V. Le, ICLR 2017 PDF PDF
Arshdeep Learning feed-forward one-shot learners PDF PDF
Arshdeep Learning to Learn by gradient descent by gradient descent PDF PDF
Arshdeep Dynamic Filter Networks https://arxiv.org/abs/1605.09673 PDF PDF

[17]: Basic16- DNN to be Scalable


scalable random sparsity binary hash compression low-rank distributed dimension reduction pruning sketch Parallel
Presenter Papers Paper URL Our Slides
scalable Sanjiv Kumar (Columbia EECS 6898), Lecture: Introduction to large-scale machine learning 2010 PDF
data scalable Alex Smola - Berkeley SML: Scalable Machine Learning: Syllabus 2012 PDF 2014 + PDF
Binary Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1
Model Binary embeddings with structured hashed projections PDF PDF
Model Deep Compression: Compressing Deep Neural Networks (ICLR 2016) PDF PDF

[18]: Basic16- Basic Deep NN with Memory


memory NTM seq2seq pointer set attention meta-learning Few-Shot matching net metric-learning
Presenter Papers Paper URL Our Slides
seq2seq Sequence to Sequence Learning with Neural Networks PDF  
Set Pointer Networks PDF  
Set Order Matters: Sequence to Sequence for Sets PDF  
Point Attention Multiple Object Recognition with Visual Attention PDF  
Memory End-To-End Memory Networks PDF Jack Survey
Memory Neural Turing Machines PDF  
Memory Hybrid computing using a neural network with dynamic external memory PDF  
Muthu Matching Networks for One Shot Learning (NIPS16) PDF PDF
Jack Meta-Learning with Memory-Augmented Neural Networks (ICML16) PDF PDF
Metric ICML07 Best Paper - Information-Theoretic Metric Learning PDF  

[19]: Basic16- Basic DNN Embedding we read for Ranking/QA


embedding recommendation QA graph relational
Presenter Papers Paper URL Our Slides
QA Learning to rank with (a lot of) word features PDF  
Relation A semantic matching energy function for learning with multi-relational data PDF  
Relation Translating embeddings for modeling multi-relational data PDF  
QA Reading Wikipedia to answer open-domain questions PDF
QA Question answering with subgraph embeddings PDF  

[20]: Basic16- Basic DNN Reads we finished for NLP/Text


embedding text BERT seq2seq attention NLP curriculum BackProp relational
Presenter Papers Paper URL Our Slides
NLP A Neural Probabilistic Language Model PDF  
Text Bag of Tricks for Efficient Text Classification PDF  
Text Character-level Convolutional Networks for Text Classification PDF  
NLP BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding PDF  
seq2seq Neural Machine Translation by Jointly Learning to Align and Translate PDF  
NLP Natural Language Processing (almost) from Scratch PDF  
Train Curriculum learning PDF  
Muthu NeurIPS Embedding Papers survey 2012 to 2015 NIPS PDF
Basics Efficient BackProp PDF  


