Two Stage Extractive Text Summarization
1  Department of Computer Science, Aliko Dangote University of Science and Technology, Kano, Nigeria
2  Department of Software Engineering, Northwest University, Kano, Nigeria
Academic Editor: Lucia Billeci

Abstract:

The rapid growth of digital text highlights the need for effective summarization. Traditional graph-based methods such as TextRank often fall short by relying primarily on lexical similarity, which can miss crucial semantic connections and deeper contextual meaning. This study proposes a two-stage summarization framework that integrates an enhanced graph-based ranking mechanism with a metaheuristic optimization strategy. In the first stage, we modify the conventional TextRank algorithm by redefining edge weights through a combination of lexical, structural, and semantic attributes, specifically sentence position, bigram overlap, and SBERT-based semantic similarity. This multi-feature integration improves the estimation of sentence significance by capturing both surface-level and contextual relationships. In the second stage, we present a refined Snake Optimization Algorithm that identifies an optimal subset of sentences through a fitness function. This function integrates ROUGE-1, ROUGE-2, and ROUGE-L metrics, SBERT-based semantic similarity with respect to the reference summary, and a sentence-threshold penalty to control redundancy and length. Results on the Medium Article dataset demonstrate improved summarization quality on both lexical and semantic metrics, validating the effectiveness of the proposed two-stage strategy. This research contributes to the advancement of extractive summarization models by combining graph-based ranking with semantically informed optimization.
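The two components described above can be sketched in code. The snippet below is a minimal illustration, not the paper's implementation: the weighting coefficients, the exact form of the positional score, and the fitness weights are all hypothetical placeholders, and the SBERT similarities are assumed to be precomputed elsewhere.

```python
def edge_weight(tokens_i, tokens_j, pos_i, pos_j, n_sents,
                semantic_sim, w_lex=0.3, w_pos=0.3, w_sem=0.4):
    """Combined TextRank edge weight from bigram overlap, sentence
    position, and a precomputed SBERT cosine similarity.
    The mixing weights w_lex/w_pos/w_sem are illustrative only."""
    # Lexical component: Jaccard overlap of bigram sets.
    bi_i = set(zip(tokens_i, tokens_i[1:]))
    bi_j = set(zip(tokens_j, tokens_j[1:]))
    lexical = len(bi_i & bi_j) / (len(bi_i | bi_j) or 1)
    # Structural component: earlier sentence pairs score higher
    # (one plausible positional scheme, not the paper's formula).
    positional = 1.0 - (pos_i + pos_j) / (2.0 * n_sents)
    return w_lex * lexical + w_pos * positional + w_sem * semantic_sim


def fitness(rouge1, rouge2, rougeL, sbert_sim,
            n_selected, max_sents,
            alpha=(0.25, 0.25, 0.25, 0.25), penalty=0.1):
    """Candidate-summary fitness for the optimizer: a weighted sum of
    ROUGE-1/2/L and SBERT similarity to the reference summary, minus a
    penalty for exceeding the sentence threshold (weights hypothetical)."""
    score = (alpha[0] * rouge1 + alpha[1] * rouge2 +
             alpha[2] * rougeL + alpha[3] * sbert_sim)
    overflow = max(0, n_selected - max_sents)
    return score - penalty * overflow
```

In this sketch the graph stage would call `edge_weight` for every sentence pair before running PageRank-style ranking, while the optimization stage evaluates `fitness` for each candidate sentence subset proposed by the Snake Optimization Algorithm.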

Keywords: Extractive Summarization, TextRank, Snake Optimization Algorithm, Sentence Ranking, Metaheuristic Optimization, Natural Language Processing