Design and Implementation of an Integrated Data Mining Framework for Efficient Knowledge Discovery in Large-Scale Data Warehousing Systems
Keywords:
Data Mining, Data Warehousing, Knowledge Discovery, Big Data, Hybrid Algorithms, Modular Architecture, OLAP, ETL, Scalability
Abstract
The exponential growth of data across various sectors necessitates the integration of advanced data mining techniques within data warehousing systems. This paper presents a novel, scalable framework designed to unify the processes of data collection, preprocessing, mining, and interpretation to extract actionable knowledge. By leveraging hybrid algorithms and a modular architecture, the proposed framework enhances both performance and scalability, making it suitable for enterprise-level deployments. The study outlines the architecture, implementation, and evaluation of the framework through empirical experiments on benchmark datasets.
License
Copyright (c) 2025 Chike Obi (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.