is up and running!

The website introduces a suite of deep learning tools we have developed for learning patterns and making predictions from data sets in biomedicine.

Background: Representation Learning and Deep Learning

The performance of machine learning algorithms depends largely on the data representation (or features) to which they are applied. Deep learning aims at discovering learning algorithms that can find multiple levels of representation directly from data, with higher levels representing more abstract concepts. In recent years, the field of deep learning has led to groundbreaking performance in many applications such as computer vision, speech understanding, natural language processing, and computational biology.

This website: our deep learning and representation learning tools for learning from and predicting on biomedical data.

Recent advances in next-generation sequencing have allowed biologists to profile large volumes of DNA sequence, gene expression, and chromatin patterns across many cell types, covering the full human genome. These datasets have been made available through large-scale repositories such as ENCODE, REMC, and TCGA. Processing and understanding this “big” data poses a number of computational challenges that conventional bioinformatics analysis cannot handle.

We have designed novel and robust representation-learning and deep learning algorithms to process this flood of genome-wide datasets.

| No. | Tool Name | BioData Type | Short Description |
|-----|-----------|--------------|-------------------|
| 0 | AttentiveChrome | Epigenomics | Attend and Predict: Using Deep Attention Model to Understand Gene Regulation by Selective Attention on Chromatin |
| 1 | DeepChrome | Epigenomics | Deep-learning for predicting gene expression from histone modifications |
| 2 | DeepMotif | Functional Genomics | Visualizing and Understanding Genomic Sequences Using Deep Neural Networks |
| 3 | Memory Matching Net | Functional Genomics | Memory Matching Networks for Genomic Sequence Classification |
| 4 | MUST-CNN | Protein Tagging | A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-based Protein Structure Prediction |
| 5 | MultitaskProteinTag | Protein Tagging | A unified multitask architecture for predicting local protein properties |
| 6 | GakCo-SVM | Biomedical Sequences | A Fast GApped k-mer string Kernel using COunting |
| 7 | TransferSK-SVM | Functional Genomics | Transfer String Kernel for Cross-Context DNA-Protein Binding Prediction |
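
To give a feel for the idea behind tool 6, a gapped k-mer string kernel scores two sequences by how many gapped k-mers they share: every window of length g contributes one feature for each choice of k positions kept out of g. The sketch below is a deliberately naive Python illustration of that definition, not the GakCo counting algorithm itself; the function names and the parameters `g` and `k` are hypothetical, chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def gapped_kmer_counts(seq, g, k):
    """Count gapped k-mers: for each length-g window, keep k of its g positions."""
    counts = Counter()
    for start in range(len(seq) - g + 1):
        window = seq[start:start + g]
        # Each feature is (which positions were kept, the characters at them).
        for positions in combinations(range(g), k):
            key = (positions, "".join(window[p] for p in positions))
            counts[key] += 1
    return counts


def gapped_kmer_kernel(x, y, g=3, k=2):
    """Inner product of the gapped k-mer count vectors of x and y (naive version)."""
    cx = gapped_kmer_counts(x, g, k)
    cy = gapped_kmer_counts(y, g, k)
    # Counter returns 0 for missing keys, so unshared features contribute nothing.
    return sum(n * cy[key] for key, n in cx.items())


if __name__ == "__main__":
    print(gapped_kmer_kernel("ACG", "ACG"))  # identical windows share all 3 features
    print(gapped_kmer_kernel("ACG", "ATG"))  # only the (first, last) positions match
```

This version enumerates every feature explicitly, so its cost grows with the number of position subsets; it is meant only to make the kernel's definition concrete.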


Have questions or suggestions? Feel free to ask me on Twitter or email me.

Thanks for reading!
