Scalable Fake News Detection: Implementing NLP and Embedding Models for Large-Scale Data
Abstract
This paper describes a scalable approach to fake news detection that employs Natural Language Processing and word embedding models on large datasets. The work combines several embedding techniques (Bag of Words, TF-IDF, Word2Vec, and Bidirectional Encoder Representations from Transformers, BERT) to capture not only word frequency but also semantic relations within news articles. These embeddings are paired with machine learning classifiers, including logistic regression, random forests, and neural networks, to evaluate how the different models perform. The resulting system relies on distributed processing frameworks to handle large volumes of data and to enable large-scale model training. Experiments on widely adopted fake news datasets, including PolitiFact and the LIAR dataset, show strong classification results, particularly when employing deep learning-based embeddings such as BERT, which outperform traditional methods in accuracy and recall. We also investigate the effect of text preprocessing methods (e.g., stop-word removal and tokenization) on classification results. Our findings highlight the trade-offs involved in deploying large-scale fake news detection systems, balancing model complexity against computational efficiency.
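As a rough illustration of the embedding-plus-classifier pairing the abstract describes (not the authors' exact implementation), the following minimal Python sketch combines TF-IDF features with a logistic regression classifier and applies the kind of preprocessing discussed above; the toy corpus and labels are hypothetical stand-ins for the LIAR or PolitiFact data.

```python
# Illustrative sketch only: TF-IDF embedding + logistic regression classifier,
# one of the pairings evaluated in the paper. The corpus below is a made-up
# placeholder; in practice the LIAR/PolitiFact splits would be loaded instead.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical toy articles and labels (0 = real, 1 = fake).
texts = [
    "Senator claims unemployment fell to a record low last year",
    "Miracle cure hidden by doctors revealed in leaked memo",
    "City council approves new budget for public transit expansion",
    "Celebrity secretly replaced by body double, insiders say",
]
labels = [0, 1, 0, 1]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.5, random_state=42, stratify=labels
)

# Stop-word removal and tokenization are handled inside the vectorizer,
# mirroring the preprocessing choices whose impact the paper examines.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english", lowercase=True)),
    ("clf", LogisticRegression(max_iter=1000)),
])

pipeline.fit(X_train, y_train)
preds = pipeline.predict(X_test)
print("accuracy:", accuracy_score(y_test, preds))
print("recall:", recall_score(y_test, preds))
```

Swapping the TF-IDF step for Word2Vec or BERT sentence embeddings, and the classifier for a random forest or neural network, yields the other configurations the paper compares; the distributed, large-scale training infrastructure is outside the scope of this sketch.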