Examining Extractive Text Summarization Methods with CNN Stories


Abstract


With the exponential growth of digital content on the Internet, text summarization has become an important area of study. This study compares four extractive summarization techniques (feature-based, frequency-based, graph-based, and LSA-based) on the DeepMind CNN Stories dataset, using ROUGE metrics to assess performance. The feature-based approach outperformed the other techniques, achieving the highest ROUGE-1 F1 scores. It also produced summaries that were more comprehensible and informative than those of the other methods. These findings offer insight into the relative effectiveness of extractive summarization methods and highlight the capacity of the feature-based approach to produce high-quality summaries.




Keywords


Text Summarization; Extractive Text Summarization; DeepMind CNN; ROUGE; TF-IDF; LSA; TextRank; Feature Scoring
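
For readers unfamiliar with the evaluation metric named in the abstract, the ROUGE-1 F1 score can be illustrated with a minimal sketch. It assumes lowercase regex tokenization with no stemming or stop-word removal; the function name `rouge1_f1` is hypothetical, and the scores reported in this study may have been computed with a different ROUGE implementation and preprocessing.

```python
import re
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary.

    Assumption: tokens are lowercase word characters; no stemming is applied.
    """
    cand = Counter(re.findall(r"\w+", candidate.lower()))
    ref = Counter(re.findall(r"\w+", reference.lower()))

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand.values())  # matched / candidate length
    recall = overlap / sum(ref.values())      # matched / reference length
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
    print(f"ROUGE-1 F1: {score:.4f}")
```

In this toy pair, five of the six unigrams in each summary match, so precision and recall are both 5/6 and so is the F1 score.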