Self-Similar Layered Hidden Markov Model (SSLHMM)

 

Hidden Markov Models (HMM) have proven to be useful in a variety of real world applications where considerations for uncertainty are crucial. Such an advantage can be more leveraged if HMM can be scaled up to deal with complex problems. In this paper, we introduce, analyze and demonstrate Self-Similar Layered HMM (SSLHMM), for a certain group of complex problems which show self-similar property, and exploit this property to reduce the complexity of model construction. We show how the embedded knowledge of self-similar structure can be used to reduce the complexity of learning and increase the accuracy of the learned model. Moreover, we introduce three different types of self-similarity in SSLHMM, and investigate their performance in the context of synthetic data and real-world network databases. We show that SSLHMM has several advantages comparing to conventional HMM techniques and it is more efficient and accurate than one-step, flat method for model construction.

There is a vast amount of natural structures and physical systems which contain self-similar structures that are made through recurrent processes. To name a few: ocean flows, changes in the yearly flood levels of rivers, voltages across nerve membranes, musical melodies, human brains, economic markets, Internet web logs and network data create enormously complex self-similar data. While there have been much effort on observing self-similar structures in scientific databases and natural structures, there are few works on using self-similar structure and fractal dimension for the purpose of data mining and predictive modeling. Among these works, using fractal dimension and self-similarity to reduce the dimensionally curse, learning association rules and applications in spatial joint selectivity in databases are considerable. In this paper we introduce a novel technique which uses the self-similar structure for predictive modeling using a Self-Similar Layered Hidden Markov Model (SSLHMM). Despite the broad range of application areas shown for classic HMMs, they do have limitations and do not easily handle problems with certain characteristics. For instance, classic HMM has difficulties to model complex problems with large states spaces. Among the recognized limitations, we only focus on complexity of HMM for a certain category of problems with the following characteristics: 1) The uncertainty and complexity embedded in these applications make it difficult and impractical to construct the model in one step. 2) Systems are self-similar, contain self-similar structures and have been generated through recurrent processes. For instance, analysis of traffic data from networks and services such as ISDN traffic, Ethernet LAN’s, Common Channel Signaling Network (CCNS) and Variable Bit Rate (VBR) video have all convincingly demonstrated the presence of features such as self-similarity, long range dependence, slowly decaying variances, heavy-tailed distributions and fractal dimensions. In a companion paper, Adibi and Shen introduced a domain independent novel technique to mine sequential databases through Mining by Layered Phases (MLP) in both discrete and continuous domains. In this work we introduce a special form of MLP as Self-Similar Layered HMM (SSLHMM) for self-similar structures. We show how SSLHMM uses the information embedded in a self-similar structure to reduce the complexity of the problem and learn a more accurate model than a general HMM. Our result is encouraging and show a significant improvement when a self-similar data are modeled through SSLHMM in comparison with HMM.