Multidimensional Multi-granularities Data Mining for Discover Association Rule
Data mining is one of the most significant tools for discovering association patterns in many knowledge domains. Yet current data-mining techniques have several deficits: 1) Current methods are based on plane mining using pre-defined schemata, so the entire database must be re-scanned whenever new attributes are added. 2) An association rule may be true at a certain granularity but false at a smaller one, and vice versa. 3) Existing methods can find either frequent rules or infrequent rules, but not both at the same time. This paper proposes a novel algorithm along with a data structure that together address these weaknesses. The proposed approach can thus improve the efficiency and effectiveness of related data mining approaches. By means of the data structure, we construct a forest of concept taxonomies that can be used to represent the knowledge space. On top of the concept taxonomies, data mining is developed as a compound process that finds the large itemsets and generates, updates, and outputs the association patterns that can represent the composition of various taxonomies. This paper also derives a set of benchmarks to demonstrate the level of efficiency and effectiveness of the data mining algorithm. Finally, it presents experimental results on the efficiency, scalability, information loss, etc. of the proposed approach to demonstrate its advantages.
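The second deficit, that a rule may hold at one granularity but fail at a smaller one, can be illustrated with a minimal sketch. This is not the algorithm or data structure proposed in the paper. The transactions, the segment names (`north`, `south`), the thresholds, and the helper functions `rule_stats` and `holds` are all hypothetical and chosen only for illustration, using the standard definitions of support and confidence.

```python
from fractions import Fraction

def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) of antecedent -> consequent.

    Standard definitions: support is the share of transactions containing
    both sides; confidence is that count divided by the number of
    transactions containing the antecedent.
    """
    n = len(transactions)
    both = antecedent | consequent
    n_ante = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if both <= t)
    support = Fraction(n_both, n) if n else Fraction(0)
    confidence = Fraction(n_both, n_ante) if n_ante else Fraction(0)
    return support, confidence

def holds(transactions, antecedent, consequent, min_support, min_confidence):
    """True if the rule meets both thresholds on the given transactions."""
    support, confidence = rule_stats(transactions, antecedent, consequent)
    return support >= min_support and confidence >= min_confidence

# Hypothetical data: each segment is a finer granularity of the whole database.
segments = {
    "north": [{"bread", "butter"}, {"bread", "butter"},
              {"bread", "butter"}, {"milk"}],
    "south": [{"bread"}, {"bread"}, {"bread", "butter"}, {"milk"}],
}
all_tx = [t for txs in segments.values() for t in txs]

A, C = {"bread"}, {"butter"}
MIN_SUP, MIN_CONF = Fraction(2, 5), Fraction(3, 5)  # assumed thresholds

print("all  ", rule_stats(all_tx, A, C), holds(all_tx, A, C, MIN_SUP, MIN_CONF))
for name, txs in segments.items():
    print(name, rule_stats(txs, A, C), holds(txs, A, C, MIN_SUP, MIN_CONF))
```

With these made-up numbers, the rule bread -> butter meets both thresholds on the whole database and in the `north` segment, but fails in the `south` segment. A result mined at a single granularity can therefore hide segments where the rule does not apply.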