The performance of machine learning algorithms is largely dependent on the data representation (or features) to which they are applied. Deep learning aims to learn multiple levels of representation directly from data, with higher levels capturing more abstract concepts. In recent years, the field of deep learning has led to groundbreaking performance in many applications such as computer vision, speech understanding, natural language processing, and computational biology.
Deep learning constructs networks of parameterized functional modules, which are trained on reference examples using gradient-based optimization [Lecun19].
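To make this concrete, here is a minimal PyTorch sketch of the idea: a small network of parameterized modules fit to reference examples by gradient descent. The architecture, data, and hyperparameters are illustrative placeholders, not anything from the text above.

```python
import torch
from torch import nn

# A toy network of parameterized functional modules.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

x = torch.randn(32, 4)  # placeholder reference inputs
y = torch.randn(32, 1)  # placeholder reference targets

for _ in range(100):
    loss = nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()   # gradients via backpropagation
    optimizer.step()  # gradient-based parameter update
```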
Since it is hard to estimate gradients through functions of discrete random variables, we are interested in how to make deep learning behave well on discrete data and discrete representations. Developing such techniques is an active research area, and we focus on investigating interpretable and scalable approaches.
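To illustrate the difficulty, one widely used workaround (a sketch of a standard trick, not necessarily the approach we pursue) is the straight-through estimator: sample a discrete value on the forward pass, but route gradients through a continuous surrogate on the backward pass.

```python
import torch

def straight_through_sample(logits: torch.Tensor) -> torch.Tensor:
    """Sample a one-hot vector; backpropagate as if it were softmax(logits)."""
    probs = torch.softmax(logits, dim=-1)
    index = torch.multinomial(probs, num_samples=1)
    hard = torch.zeros_like(probs).scatter_(-1, index, 1.0)
    # Forward value equals the discrete `hard` sample; the detach trick
    # makes gradients flow through the continuous `probs` instead.
    return hard + probs - probs.detach()

logits = torch.randn(8, 5, requires_grad=True)
sample = straight_through_sample(logits)
sample.sum().backward()  # gradients reach `logits` despite the sampling step
```

A related relaxation, Gumbel-Softmax, is available directly in PyTorch as `torch.nn.functional.gumbel_softmax(logits, hard=True)`; both are biased estimators, which is part of what makes this problem an open research question.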
Have questions or suggestions? Feel free to ask me on Twitter or email me.
Thanks for reading!