Abstract
Summarization is challenging because computers cannot readily comprehend human language or the emotions it conveys. To address this challenge, several machine learning models have been trained, tested and used to produce summaries that retain crucial information. Automatic text summarization has attracted researchers because it extracts relevant information in less time, and its uses include generating concise summaries from many sources such as emails, news articles, mobile news and corporate information. One hundred thousand (100,000) heart disease case records, contained in different types of documents, were obtained from the Kaggle online Dataset for Heart Disease. The records were fed into the Naïve Bayes algorithm for data cleansing and treatment, and were then split into 70% for training and 30% for testing. Association Rule Mining was used alongside the classifier to produce reliable summaries. The enhanced Naive Bayes classifier served as the initiator, providing a set of relevance scores by analyzing word frequencies and other document features. These scores were then used within the broader summarization framework, and the two outputs were combined to generate a comprehensive and meaningful summary. Performance was evaluated using accuracy, precision, recall and F1 score. The model, which combines enhanced Naive Bayes (e-NB) and Association Rule Mining (ARM), delivered superior performance across all the metrics.
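As a rough illustration of how word frequencies can yield a relevance score, and how the four evaluation metrics are computed, the following minimal sketch trains a multinomial Naive Bayes model with Laplace smoothing on a toy set of tokenized sentences. The function names (`train_nb`, `relevance`, `evaluate`), the toy data and the smoothing parameter are assumptions made for illustration; the study's enhanced classifier (e-NB) and its additional document features are not reproduced here.

```python
import math
from collections import Counter

def train_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on tokenized sentences.
    labels: 1 = relevant, 0 = not relevant. alpha is the Laplace smoothing term."""
    vocab = {w for tokens in docs for w in tokens}
    counts = {0: Counter(), 1: Counter()}
    for tokens, y in zip(docs, labels):
        counts[y].update(tokens)  # per-class word frequencies
    n_per_class = Counter(labels)
    return {
        "vocab": vocab,
        "alpha": alpha,
        "counts": counts,
        "totals": {y: sum(counts[y].values()) for y in (0, 1)},
        "log_prior": {y: math.log(n_per_class[y] / len(labels)) for y in (0, 1)},
    }

def _log_joint(model, tokens, y):
    """log P(y) + sum of log P(word | y); words outside the vocabulary are skipped."""
    a, v = model["alpha"], len(model["vocab"])
    score = model["log_prior"][y]
    for w in tokens:
        if w in model["vocab"]:
            score += math.log((model["counts"][y][w] + a) / (model["totals"][y] + a * v))
    return score

def relevance(model, tokens):
    """Posterior probability that a sentence is relevant (class 1)."""
    l1, l0 = _log_joint(model, tokens, 1), _log_joint(model, tokens, 0)
    m = max(l0, l1)  # subtract the max for numerical stability
    e1, e0 = math.exp(l1 - m), math.exp(l0 - m)
    return e1 / (e1 + e0)

def evaluate(y_true, y_pred):
    """Return (accuracy, precision, recall, F1) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Toy training data (hypothetical, not taken from the study's dataset).
train_docs = [
    ["chest", "pain", "angina"],
    ["high", "cholesterol", "risk"],
    ["weather", "sunny", "today"],
    ["football", "match", "today"],
]
train_labels = [1, 1, 0, 0]
model = train_nb(train_docs, train_labels)
```

In a pipeline of this kind, sentences whose `relevance` exceeds a chosen threshold would be kept as summary candidates, and `evaluate` would be applied to predictions on the held-out 30% of the data.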
Overall, (e-NB)-ARM outperforms NB-ARM on all metrics, including accuracy, precision, recall, F1-score and time complexity. The hybridized model improves predictive accuracy by combining the probabilistic approach of Naive Bayes with the patterns generated by Association Rule Mining, allowing for more robust predictions and for accurate, interpretable insights into heart disease, which makes it the preferable choice between the two models. Adopting the enhanced hybrid Naive Bayes and Association Rule Mining ((e-NB)-ARM) system is therefore recommended for multi-document summarization.
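To make the idea of combining a probabilistic score with mined patterns more concrete, the sketch below mines simple one-to-one association rules (using support and confidence) from tokenized sentences and blends the strongest matching rule with a Naive Bayes relevance value. The abstract does not state how the two outputs are combined, so the weighted average in `combined_score`, the thresholds, the function names and the toy transactions are all illustrative assumptions rather than the study's method.

```python
from itertools import combinations

def mine_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Mine rules x -> y between single items.
    Returns {(x, y): (support, confidence)} for rules passing both thresholds."""
    sets = [set(t) for t in transactions]
    n = len(sets)
    items = sorted({i for t in sets for i in t})
    item_count = {i: sum(i in t for t in sets) for i in items}
    rules = {}
    for a, b in combinations(items, 2):
        both = sum(a in t and b in t for t in sets)
        support = both / n  # fraction of transactions containing both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = both / item_count[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules[(x, y)] = (support, confidence)
    return rules

def rule_score(tokens, rules):
    """Highest confidence among rules whose two items both occur in the sentence."""
    present = set(tokens)
    matched = [conf for (x, y), (_, conf) in rules.items()
               if x in present and y in present]
    return max(matched, default=0.0)

def combined_score(nb_relevance, tokens, rules, weight=0.5):
    """Hypothetical blend: weighted average of the NB relevance and the rule score."""
    return weight * nb_relevance + (1 - weight) * rule_score(tokens, rules)

# Toy transactions (hypothetical): each is the set of terms in one sentence.
transactions = [
    ["chest", "pain", "angina"],
    ["chest", "pain", "cholesterol"],
    ["chest", "pain"],
    ["cholesterol", "diet"],
]
rules = mine_rules(transactions)
```

For example, `combined_score(0.8, ["severe", "chest", "pain"], rules)` blends an assumed Naive Bayes relevance of 0.8 with the confidence of the `chest -> pain` rule. A weighted average is only one of several plausible ways to merge the two signals.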
In conclusion, this approach not only enhances the efficiency and quality of multi-document summarization in high-volume textual data environments but also has applications in diverse fields. This study contributes to meeting the growing need for advanced text summarization methods, facilitating faster access to essential information and supporting more informed decision-making across disciplines.