Finding memory in time

Friday, April 13, 2018, 3:00 pm - 4:00 pm PST
Conf. Rm #1135
This event is open to the public.
NL Seminar
Yuanhang Su (USC)

Abstract: For a large number of natural language processing (NLP) problems, we are concerned with finding semantic patterns in input sequences. In recurrent neural network (RNN) based approaches, such a pattern is “encoded” in a vector called the hidden state. Since Elman’s “Finding structure in time” was published in 1990, it has long been believed that the “magic power” of the RNN’s memory, which is enclosed inside the hidden state, can handle very long sequences. Yet besides some experimental observations, there is no formal definition of RNN memory, let alone a rigorous mathematical analysis of how that memory forms.

This talk will focus on understanding memory from two viewpoints. The first viewpoint is that memory is a function that maps certain elements in the input sequences to the current output. Such a definition, for the first time in the literature, allows us to do a detailed analysis of the memory of the simple RNN (SRN), long short-term memory (LSTM), and gated recurrent unit (GRU). It also opens the door to further improving the existing basic RNN models. The end results are a new basic RNN model called extended LSTM (ELSTM), with outstanding performance on complex language tasks, and a new macro RNN model called dependent bidirectional RNN (DBRNN), with smaller cross entropy than bidirectional RNN (BRNN) and encoder-decoder (enc-dec) models. The second viewpoint is that memory is a compact representation of sparse sequential data. From this perspective, the process of generating the hidden state of an RNN is simply dimension reduction. Thus, a method like principal component analysis (PCA), which does not require labels for training, becomes attractive. However, there are two known problems in implementing PCA for NLP problems: the first is computational complexity; the second is the vectorization of sentence data for PCA. To deal with these problems, an efficient dimension reduction algorithm called tree-structured multi-linear PCA is proposed.
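As a rough illustration of the first viewpoint (it is not taken from the talk, and the talk's own definition may differ), the sketch below assumes a scalar SRN with a linear activation and a zero initial state, h_t = u * h_{t-1} + w * x_t. Under those assumptions the final hidden state can be unrolled into a weighted sum of the inputs, so "memory" becomes an explicit function from input elements to the current state. The names run_srn and memory_weights and the values of u and w are hypothetical.

```python
def run_srn(inputs, u, w):
    """Run a scalar linear SRN: h_t = u * h_{t-1} + w * x_t, starting from h_0 = 0."""
    h = 0.0
    for x in inputs:
        h = u * h + w * x
    return h


def memory_weights(length, u, w):
    """Weight that input k (k = 0 .. length-1) carries in the final hidden state.

    Unrolling the recurrence gives h_T = sum_k w * u**(T-1-k) * x_k.
    """
    return [w * u ** (length - 1 - k) for k in range(length)]


# Illustrative values only (assumptions, not from the talk).
inputs = [1.0, 0.0, 2.0, -1.0]
u, w = 0.5, 1.0

final_state = run_srn(inputs, u, w)
weights = memory_weights(len(inputs), u, w)
unrolled = sum(a * x for a, x in zip(weights, inputs))

print(final_state, unrolled)  # both views give the same hidden state
print(weights)                # older inputs carry geometrically smaller weight
```

With |u| < 1 the weight on an input shrinks geometrically with its distance from the current step, which is one simple way to see why a plain SRN may struggle to retain very long sequences. Gated models such as LSTM and GRU do not reduce to this closed form.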

Bio: Yuanhang Su received dual B.S. degrees in Electrical Engineering & Automation and Electronic & Electrical Engineering from the University of Strathclyde, Glasgow, U.K., and Shanghai University of Electric Power, Shanghai, China, respectively, in 2009, and the M.S. degree in Electrical Engineering from the University of Southern California, Los Angeles, CA, in 2010. From 2011 to 2015, he worked as an image/video/camera software and algorithm engineer, consecutively, for a Los Angeles startup named Exaimage, Shanghai Aerospace Electronics Technology Institute in China, and Huawei Technology in China. He joined the MCL lab in spring 2016 and is currently pursuing his Ph.D. in computer vision, natural language processing, and machine learning.
