A Comprehensive Study of Text Summarization with Advent of Large Language Models

Dr. Geetanjali Vinayak Kale, Prof. Pranali Rajendra Navghare, Atharva Sadanand Litake, Apurva Ajit Kulkarni, Kishanlal Chhelaram Choudhary, Aditya Ashru Darade

Abstract

Introduction: Communication is at the heart of human society. With the growth of social media and other communication platforms, the globe is now connected at a single click, and people share information through these platforms continuously. A massive amount of data is generated and analysed every second, and extracting insights from this Big Data is a difficult task. Text summarization is the process of representing textual data concisely so as to extract the most important information from it, and it plays a major role in analysing big data and making decisions based on the insights drawn. With the advent of Large Language Models, summarization techniques have been enhanced to a large extent. This paper surveys earlier techniques for text summarization and studies the new methodologies that use Large Language Models, aiming to deliver the most up-to-date survey of text summarization and its enhancement through Large Language Models.


Objectives: The objective of this paper is to provide a comprehensive study of text summarization, focusing on its evolution and the advancements brought by Large Language Models (LLMs). It aims to analyse the development of summarization techniques, including both extractive and abstractive methods, while exploring the mathematical algorithms and machine learning models that underpin these approaches. Additionally, the paper discusses the impact of advances in natural language processing (NLP), particularly LLMs, in enhancing the accuracy and efficiency of summarization. A comparative analysis of traditional and modern approaches is presented, evaluating their effectiveness on various datasets. Furthermore, the study highlights key research contributions in the field and identifies current challenges, paving the way for future innovations in text summarization.


Methods: Text summarization research has evolved significantly, spanning extractive and abstractive methods built on machine learning and, more recently, LLMs. Mark Dredze et al. used latent semantic analysis (LSA) and latent Dirichlet allocation (LDA) for email summarization, while Mohamed Abdel Fattah et al. trained machine-learning models for sentence extraction. Jan Ulrich et al. applied regression-based learning to summarize email threads, and Pete Burnap et al. focused on summarizing real-world events from social media.
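As a rough illustration of the extractive line of work surveyed above, the sketch below ranks sentences with LSA over a TF-IDF term-sentence matrix and keeps the highest-scoring ones. It is a minimal sketch, not any surveyed system's pipeline: the naive sentence splitter, the `n_components` value, and the scoring rule are all illustrative assumptions.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

def lsa_summary(text, n_sentences=2, n_components=2):
    """Pick the n_sentences whose latent-topic loading is strongest."""
    # Naive sentence split; real systems use a trained sentence tokenizer.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    if len(sentences) <= n_sentences:
        return sentences
    # Build a TF-IDF term-sentence matrix.
    matrix = TfidfVectorizer(stop_words="english").fit_transform(sentences)
    # LSA: project sentences into a low-rank latent "topic" space.
    k = min(n_components, matrix.shape[1] - 1)
    loadings = TruncatedSVD(n_components=k).fit_transform(matrix)
    # Score each sentence by its strongest topic loading and return the
    # top ones in their original order.
    scores = np.abs(loadings).max(axis=1)
    top = sorted(np.argsort(scores)[-n_sentences:])
    return [sentences[i] for i in top]
```

The surveyed email and news systems layer domain-specific preprocessing and richer features (position, length, cue words) on top of this core ranking idea.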


Derek Miller et al. used BERT embeddings with K-Means clustering for lecture summarization, and Rahim Khan et al. leveraged K-Means and term frequency-inverse document frequency (TF-IDF) weighting for news summarization. Mingxi Zhang et al. optimized TextRank for keyword extraction. Jingqing Zhang et al. introduced PEGASUS, which pre-trains a transformer with gap-sentence generation for abstractive summarization. Zhang et al. also explored long-dialogue summarization using retrieval-based and hierarchical encoding methods.
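The cluster-based extractive approach attributed to Derek Miller et al. can be sketched as: embed each sentence, cluster the embeddings, and keep the sentence nearest each centroid. The sketch below is a minimal version under stated assumptions, with the `all-MiniLM-L6-v2` checkpoint standing in for the BERT embeddings of the original work and an illustrative cluster count.

```python
import numpy as np
from sklearn.cluster import KMeans
from sentence_transformers import SentenceTransformer

def cluster_summary(sentences, n_clusters=3):
    """One representative sentence per embedding cluster."""
    # Embed each sentence with a pretrained encoder (a BERT stand-in).
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    emb = encoder.encode(sentences)
    # Cluster the sentence embeddings into rough topics.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(emb)
    picks = []
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        # Keep the member sentence closest to the cluster centroid.
        dists = np.linalg.norm(emb[members] - km.cluster_centers_[c], axis=1)
        picks.append(members[dists.argmin()])
    return [sentences[i] for i in sorted(picks)]
```

On the abstractive side, PEGASUS is typically driven through a pretrained sequence-to-sequence checkpoint. Below is a minimal sketch using the Hugging Face Transformers API, assuming the public `google/pegasus-xsum` checkpoint and illustrative generation settings rather than the paper's exact setup.

```python
from transformers import PegasusTokenizer, PegasusForConditionalGeneration

checkpoint = "google/pegasus-xsum"  # assumed public checkpoint
tokenizer = PegasusTokenizer.from_pretrained(checkpoint)
model = PegasusForConditionalGeneration.from_pretrained(checkpoint)

document = "..."  # any source text to be summarized
inputs = tokenizer(document, truncation=True, return_tensors="pt")
# Beam search over the decoder produces the abstractive summary.
summary_ids = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```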


Conclusions: The advent of Large Language Models (LLMs) has significantly advanced text summarization by addressing the shortcomings of earlier extractive and abstractive techniques. Traditional extractive methods often produced fragmented summaries, while early abstractive approaches struggled with coherence and redundancy. LLMs, powered by transformers and self-attention mechanisms, have enabled more fluent, contextually aware, and human-like summaries. These models can effectively capture long-range dependencies, rephrase content, and generate concise yet meaningful summaries. As a result, LLMs have expanded the scope of text summarization, making it more applicable and reliable across various domains, including news, research, and automated content generation.
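The self-attention mechanism credited above with capturing long-range dependencies reduces to a short computation, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. The sketch below implements a single attention head in NumPy; all dimensions and weight matrices are toy values chosen for illustration.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention:
    softmax(Q K^T / sqrt(d_k)) V."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])       # pairwise token affinities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                            # context-mixed token vectors

# Toy usage: 4 tokens of width 8, head width 4 (all sizes illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 4)
```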
