Deep Learning's Generalization, Especially on Structured Discrete Data

This front page is adapted from our legacy website deeplearning4discrete.net and introduces a suite of deep learning tools we have developed to improve deep learning generalization, especially on discrete structured data types such as text, graphs, or sets. Please feel free to email me if you find any typos.

Background: Why Are Generalization Topics in Deep Learning Interesting?

Generalization refers to how well a machine learning model adapts to new, previously unseen data. We focus on out-of-distribution (OOD) generalization, where the unseen data come from a different distribution than the training data.


Why Is Structured Discrete Data Interesting?

Deep learning constructs networks of parameterized functional modules and trains them from reference examples using gradient-based optimization [Lecun19].

Since it is hard to estimate gradients through functions of discrete random variables, we are interested in how to make deep learning behave well on discrete structured data and structured representations. Developing such techniques is an active research area. We focus on investigating interpretable and scalable techniques for doing so.
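To make the difficulty concrete, here is a minimal sketch of one generic workaround, the score-function (REINFORCE) estimator, applied to a single Bernoulli variable. It is only an illustration and not one of the tools introduced on this site: the objective `f`, the logit `theta`, and the sample count are all hypothetical choices made for the example. Sampling a discrete `z` is not differentiable, so the sketch differentiates the log-probability of the sample instead and compares the Monte Carlo estimate with the gradient that can be computed in closed form for this toy case.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def f(z):
    # Hypothetical downstream objective evaluated on a discrete sample z in {0, 1}.
    return (z - 0.3) ** 2


def exact_gradient(theta):
    # E[f(z)] = p * f(1) + (1 - p) * f(0) with p = sigmoid(theta),
    # so d/dtheta E[f(z)] = p * (1 - p) * (f(1) - f(0)).
    p = sigmoid(theta)
    return p * (1.0 - p) * (f(1.0) - f(0.0))


def score_function_gradient(theta, num_samples, rng):
    # Monte Carlo estimate of d/dtheta E[f(z)] using
    # E[f(z) * d/dtheta log p(z; theta)].
    p = sigmoid(theta)
    z = (rng.random(num_samples) < p).astype(np.float64)
    # For a Bernoulli variable with logit theta, d/dtheta log p(z; theta) = z - p.
    score = z - p
    return float(np.mean(f(z) * score))


rng = np.random.default_rng(0)
theta = 0.5
estimate = score_function_gradient(theta, 200_000, rng)
exact = exact_gradient(theta)
print(f"score-function estimate: {estimate:.4f}, exact gradient: {exact:.4f}")
```

The estimator is unbiased but can have high variance, which is one reason gradient estimation for discrete variables remains hard in practice; relaxation-based alternatives trade some bias for lower variance.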

Relevant Papers We Published

We can use a component view to categorize the research topics in OOD generalization:
- (1) sample level
- (2) feature level
- (3) representation/encoding level
- (4) loss level
- (5) task level (e.g., meta learning, few-shot generalization)

Please check out each item in our sidebar.


Contacts:

Have questions or suggestions? Feel free to ask me on Twitter or email me.

Thanks for reading!

Zhe’s PhD Defense - Toward Out-Of-Distribution Generalization Of Deep Learning Models

2 minute read

Ph.D. Dissertation Defense by Zhe Wang, Tues., 04/02/24, at 12:00PM (ET) Committee:

Arsh’s PhD Defense - Relational Structure Discovery for Deep Learning

1 minute read

Arshdeep Sekhon’s PhD Defense June 29, 2022.

JackL’s PhD Defense - Modeling interactions with Deep Learning

less than 1 minute read

Ph.D. Dissertation Defense by Jack Lanchantin Tuesday, July 20th, 2021 at 2:00 PM (ET), via Zoom. Committee: Vicente Ordóñez Román, Committee Chair, ...

ACM BCB - Transfer Learning for Predicting Virus-Host Protein Interactions for Novel Virus Sequences

2 minute read

CVPR - General Multi-label Image Classification with Transformers

1 minute read

Title: General Multi-label Image Classification with Transformers

AAAI - Curriculum Labeling- Self-paced Pseudo-Labeling for Semi-Supervised Learning

1 minute read

Title: Curriculum Labeling- Self-paced Pseudo-Labeling for Semi-Supervised Learning

NeurIPS - Measuring Visual Generalization in Continuous Control from Pixels

less than 1 minute read

EMNLP - Benchmarking Search Algorithms for Generating NLP Adversarial Examples

1 minute read

Title: Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples

EMNLP - On Quality of Generated Adversarial Examples and How to Set Attack Constraints

1 minute read

Title: Reevaluating Adversarial Examples in Natural Language

EMNLP - TextAttack- A Framework for Adversarial Attacks in Natural Language Processing

less than 1 minute read

Bioinformatics - FastSK- Fast Sequence Analysis with Gapped String Kernels

1 minute read

Beilun’s PhD Defense - Fast and Scalable Joint Estimators for Learning Sparse Gaussian Graphical Models from Heterogeneous Data with Additional Knowledge

1 minute read

PhD Defense Presentation by Beilun Wang Friday, July 20, 2018 at 9:00 am in Rice 242 Committee Members: Mohammad Mahmoody (Chair), Yanjun Qi (Advisor), ...

MLCB - Prototype Matching Networks for Large-Scale Multi-label Genomic Sequence Classification

1 minute read

ICLR - Memory Matching Networks for Genomic Sequence Classification

less than 1 minute read

AAAI - MUST-CNN- A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-based Protein Structure Prediction

less than 1 minute read

Tool MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-based Protein Structure Prediction

NeurIPS- Learning the Dependency Structure of Latent Factors

ICLR - Unsupervised Feature Learning by Deep Sparse Coding

ECIR - Deep Learning for Character-based Information Extraction on Chinese and Protein Sequence

Plos- A unified multitask architecture for predicting local structural properties on proteins

NeurIPS - Deep Metric Learning to Learn and to Use

CIKM - Document classification with weighted supervised n-gram embedding

ECML - Systems and methods for semi-supervised relationship extraction

ICDM- Semi-Supervised Sequence Labeling with Self-Learned Feature

Bioinformatics - Semi-supervised multi-task learning Using BioText based Labels to Augment PPI Prediction