Abstract:
Mining frequent patterns is an important component of many prediction systems. One common usage in web applications is the mining of users’ access behavior for the purpose of predicting and hence pre-fetching the web pages that the user is likely to visit.
This paper presents a web usage mining model for discovering frequent patterns in sequence databases that requires only two database scans. The first scan obtains support counts for subsequences of length. The second scan extracts potentially frequent sequences and stores them in a frequent sequence tree structure (FS-tree). Frequent sequence patterns are then generated by mining the FS-tree. Clustering methods, by contrast, are unsupervised and are not normally used directly for classification. This paper incorporates clustering into the FS-tree algorithm: the pre-processed data is divided into meaningful clusters, and the clusters are then used as training data for the FS-tree algorithm in order to obtain higher accuracy.
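The two-scan idea can be illustrated with a minimal Python sketch. It is not the paper's FS-tree implementation. It assumes that the first scan counts subsequences of length two (consecutive page pairs), which the abstract does not state, and it uses a plain dictionary-based prefix tree in place of the FS-tree. The names `count_link_support`, `build_tree`, and `insert_sequence` are hypothetical, and the mining step is not shown.

```python
from collections import Counter


def count_link_support(sessions):
    """Scan 1: count how many sessions contain each consecutive page pair.

    Assumption: the counted subsequences have length two.
    """
    support = Counter()
    for session in sessions:
        # A set ensures each pair is counted at most once per session.
        support.update(set(zip(session, session[1:])))
    return support


def insert_sequence(root, sequence):
    """Insert one sequence into the prefix tree, incrementing node counts."""
    if len(sequence) < 2:
        return  # a single page carries no sequential information
    node = root
    for page in sequence:
        child = node["children"].setdefault(page, {"count": 0, "children": {}})
        child["count"] += 1
        node = child


def build_tree(sessions, support, min_support):
    """Scan 2: keep maximal runs of frequent pairs and store them in a tree.

    Assumption: a run is broken wherever a pair falls below min_support.
    """
    root = {"count": 0, "children": {}}
    for session in sessions:
        run = []
        for a, b in zip(session, session[1:]):
            if support[(a, b)] >= min_support:
                if not run:
                    run = [a]
                run.append(b)
            else:
                insert_sequence(root, run)
                run = []
        insert_sequence(root, run)
    return root
```

Under the same assumptions, the clustering step described above could be approximated by calling both functions once per cluster of sessions, which yields one tree per cluster.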